FOSSology  4.4.0
Open Source License Compliance by Open Source Software
doctorBuffer_utils.c File Reference

Doctor buffer utilities for debugging.

#include "doctorBuffer_utils.h"
#include "nomos.h"
#include "list.h"
#include "util.h"
#include "nomos_regex.h"

Macros

#define INVISIBLE   (int) '\377'
 

Functions

int compressDoctoredBuffer (char *textBuffer)
 Garbage collect: eliminate all INVISIBLE characters in the buffer.
 
void removeHtmlComments (char *buf)
 Remove HTML comments from buffer without removing comment text.
 
void removeLineComments (char *buf)
 Remove comments that start at the beginning of a line.
 
void cleanUpPostscript (char *buf)
 Remove newlines from buffer.
 
void removeBackslashesAndGTroffIndicators (char *buf)
 Remove groff/troff font-size indicators, the literal string backslash-n, and all backslashes.
 
void convertWhitespaceToSpaceAndRemoveSpecialChars (char *buf, int isCR)
 Convert white-space to real spaces, and remove unnecessary punctuation.
 
void dehyphen (char *buf)
 
void removePunctuation (char *buf)
 Clean up miscellaneous punctuation.
 
void ignoreFunctionCalls (char *buf)
 Ignore function calls to print routines.
 
void convertSpaceToInvisible (char *buf)
 
void doctorBuffer (char *buf, int isML, int isPS, int isCR)
 Convert a buffer of mixed content to text only, separated by spaces.
 

Detailed Description

Doctor buffer utilities for debugging.

Definition in file doctorBuffer_utils.c.

Function Documentation

◆ cleanUpPostscript()

void cleanUpPostscript ( char *  buf)

Remove newlines from buffer.

Parameters
[in,out] buf

Definition at line 220 of file doctorBuffer_utils.c.

◆ compressDoctoredBuffer()

int compressDoctoredBuffer ( char *  textBuffer)

Garbage collect: eliminate all INVISIBLE characters in the buffer.

Parameters
[in,out] textBuffer  Buffer to compress
Returns
Size difference between the original and the compressed buffer

Definition at line 25 of file doctorBuffer_utils.c.
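
The in-place compaction described above can be sketched as follows. This is a minimal illustration, not the FOSSology implementation: the function name compress_invisible_sketch is hypothetical, and the sketch assumes the returned size difference equals the number of INVISIBLE bytes removed.

```c
#include <stddef.h>

#define INVISIBLE (int) '\377'

/* Hypothetical sketch: remove every INVISIBLE byte from a NUL-terminated
 * buffer in place and return how many bytes were dropped. */
int compress_invisible_sketch(char *text)
{
  char *src = text; /* read cursor */
  char *dst = text; /* write cursor, never ahead of src */
  int removed = 0;

  if (text == NULL) {
    return 0;
  }
  while (*src != '\0') {
    if (*src == INVISIBLE) {
      removed++;        /* skip the marker byte */
    } else {
      *dst++ = *src;    /* keep every other byte */
    }
    src++;
  }
  *dst = '\0';
  return removed;
}
```

Because the write cursor never overtakes the read cursor, no second buffer is needed.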

◆ convertSpaceToInvisible()

void convertSpaceToInvisible ( char *  buf)

Convert each match of the regex ' [X ]+' (where X stands for the character #defined as INVISIBLE) to a single space followed by a string of INVISIBLE characters.

Parameters
[in,out] buf

Definition at line 530 of file doctorBuffer_utils.c.
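
One way to read the description above: after a space, every further space in the same run is overwritten with INVISIBLE, so only one real space survives once the buffer is compressed. The sketch below illustrates that reading; the name collapse_space_runs_sketch is hypothetical and the code is not taken from the FOSSology source.

```c
#include <stddef.h>

#define INVISIBLE (int) '\377'

/* Hypothetical sketch: keep the first space of each run and turn the
 * following spaces into INVISIBLE; INVISIBLE bytes inside the run stay
 * INVISIBLE. The buffer length does not change. */
void collapse_space_runs_sketch(char *buf)
{
  char *cp = buf;

  if (buf == NULL) {
    return;
  }
  while (*cp != '\0') {
    if (*cp++ == ' ') {
      /* cp now points just past the space that is kept */
      while (*cp == ' ' || *cp == INVISIBLE) {
        *cp++ = INVISIBLE;
      }
    }
  }
}
```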

◆ convertWhitespaceToSpaceAndRemoveSpecialChars()

void convertWhitespaceToSpaceAndRemoveSpecialChars ( char *  buf,
int  isCR 
)

Convert white-space to real spaces, and remove unnecessary punctuation.

tr -d '*=+#$|%.,:;!?()\][\140\047\042' | tr '\011\012\015' ' '

Parameters
[in,out] buf
Note
We purposely do NOT process backspace-characters here. Perhaps there's an improvement in the wings for this?

Definition at line 285 of file doctorBuffer_utils.c.
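
The tr pipeline quoted above can be mirrored in C roughly as shown below. This is a sketch under assumptions: the function name tr_like_cleanup_sketch is hypothetical, the deleted set is read as the listed punctuation plus backtick (\140), apostrophe (\047) and double quote (\042), and the isCR parameter is not modelled. The real function may mark characters rather than delete them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: delete the punctuation named in the tr command and
 * turn tab, newline and carriage return into plain spaces, in place. */
void tr_like_cleanup_sketch(char *buf)
{
  static const char drop[] = "*=+#$|%.,:;!?()][`'\"";
  char *src = buf;
  char *dst = buf;

  if (buf == NULL) {
    return;
  }
  for (; *src != '\0'; src++) {
    if (strchr(drop, *src) != NULL) {
      continue;                       /* tr -d: drop the character */
    }
    if (*src == '\t' || *src == '\n' || *src == '\r') {
      *dst++ = ' ';                   /* tr '\011\012\015' ' ' */
    } else {
      *dst++ = *src;
    }
  }
  *dst = '\0';
}
```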

◆ dehyphen()

void dehyphen ( char *  buf)

Look for words hyphenated across a break and join both halves into a single word. Regex: "[a-z]- [a-z]".

Parameters
[in,out] buf
Note
Not sure this will work based on the way we strip punctuation out of the buffer above – work on this later.

Definition at line 426 of file doctorBuffer_utils.c.
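
The pattern "[a-z]- [a-z]" can be matched without a regex engine, as in the sketch below. The name dehyphen_sketch is hypothetical, and the sketch deletes the hyphen and space outright, which is an assumption; the actual function may mark them for later removal instead.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: wherever a lowercase letter is followed by "- " and
 * another lowercase letter, drop the "- " so both halves form one word. */
void dehyphen_sketch(char *buf)
{
  char *src = buf;
  char *dst = buf;

  if (buf == NULL) {
    return;
  }
  while (*src != '\0') {
    if (src[0] == '-' && src[1] == ' ' && dst > buf
        && islower((unsigned char) dst[-1])
        && islower((unsigned char) src[2])) {
      src += 2;           /* skip the hyphen and the space */
      continue;
    }
    *dst++ = *src++;
  }
  *dst = '\0';
}
```

A hyphen that stands alone between spaces, or follows an uppercase letter, does not match the pattern and is left untouched.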

◆ doctorBuffer()

void doctorBuffer ( char *  buf,
int  isML,
int  isPS,
int  isCR 
)

Convert a buffer of mixed content to text only, separated by spaces.

The steps followed in this function are:

  1. Filter HTML/XML comments using removeHtmlComments()
  2. Filter code comments using removeLineComments()
  3. Filter PostScript using cleanUpPostscript()
  4. Filter groff/troff using removeBackslashesAndGTroffIndicators()
  5. Filter spaces and special characters using convertWhitespaceToSpaceAndRemoveSpecialChars()
  6. Filter hyphen strings using dehyphen()
  7. Filter punctuation using removePunctuation()
  8. Ignore print routines using ignoreFunctionCalls()
  9. Filter spaces using convertSpaceToInvisible()
  10. Compress the buffer using compressDoctoredBuffer()

Parameters
[in,out] buf  Buffer to filter
[in] isML  Buffer contains HTML/XML data
[in] isPS  Buffer contains PostScript data
[in] isCR

Definition at line 575 of file doctorBuffer_utils.c.

◆ ignoreFunctionCalls()

void ignoreFunctionCalls ( char *  buf)

Ignore function calls to print routines.

Only concentrate on what's being printed (sometimes programs do print licensing information) – but don't ignore real words that END in 'print', like footprint and fingerprint.

Here, we take a risk and just look for a 't' (in "footprint"), or for an 'r' (in "fingerprint"). If someone has ever coded a print routine that is named 'rprint' or 'tprint', we're spoofed.

Parameters
[in,out] buf

Definition at line 505 of file doctorBuffer_utils.c.
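
The 't'/'r' heuristic described above can be illustrated with a small search helper. This is only a sketch of the guard, not of the full function: the name next_print_call_sketch is hypothetical, the match is case-sensitive (an assumption), and what happens to a match afterwards is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: starting at 'from' (a position inside 'buf'), return
 * the next "print" that is NOT preceded by 't' or 'r', or NULL if none is
 * left. Occurrences inside "footprint" or "fingerprint" are skipped. */
const char *next_print_call_sketch(const char *buf, const char *from)
{
  const char *hit = from;

  while ((hit = strstr(hit, "print")) != NULL) {
    if (hit == buf || (hit[-1] != 't' && hit[-1] != 'r')) {
      return hit;         /* likely a print routine such as fprintf */
    }
    hit += 5;             /* real word: continue after "print" */
  }
  return NULL;
}
```

As the description warns, a routine named tprint or rprint slips past this check, because its 'print' is preceded by 't' or 'r'.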

◆ removeBackslashesAndGTroffIndicators()

void removeBackslashesAndGTroffIndicators ( char *  buf)

Remove groff/troff font-size indicators, the literal string backslash-n, and all backslashes, as in:

perl -pe 's,\s[+-][0-9]*,,g;s,\s[0-9]*,,g;s/\n//g;' | f

Parameters
[in,out] buf

Definition at line 246 of file doctorBuffer_utils.c.

◆ removeHtmlComments()

void removeHtmlComments ( char *  buf)

Remove HTML comments from buffer without removing comment text.

Parameters
[in,out] buf

Definition at line 43 of file doctorBuffer_utils.c.

◆ removeLineComments()

void removeLineComments ( char *  buf)

Remove comments that start at the beginning of a line.

Handles comment markers such as *, ^dnl, ^xcomm, ^comment, and //, preserving the comment text.

Parameters
[in,out] buf

When MODULE_LICENSE("GPL") is commented out, do not remove that line.

Definition at line 120 of file doctorBuffer_utils.c.

◆ removePunctuation()

void removePunctuation ( char *  buf)

Clean up miscellaneous punctuation.

perl -pe 's,[-_/]+ , ,g;s/print[_a-zA-Z]* //g;s/ / /g;'

Parameters
[in,out] buf

Definition at line 467 of file doctorBuffer_utils.c.