Friday, December 9, 2011

Usenix: Dartmouth Updating Diff, Grep Unix Tools

IDG News Service (12/08/11) Joab Jackson

Dartmouth College researchers are updating grep and diff, the command-line text-analysis tools found in virtually all Linux and Unix distributions, to handle more complex types of data.  The updates are needed because "we now tend to have more model-based configuration languages that have meaningful constructs spanning more than one line," says Dartmouth graduate student Gabriel Weaver.  The researchers say the updated tools will enable administrators to extract meaningful data from configuration files, log files, and other sources of operational data.  As with the original tools, the output of either program can be passed to other utilities, so both can be incorporated into scripts that automate routine system administration tasks.  The new programs, called Context-Free Grep and Hierarchical Diff, will parse blocks of data rather than single lines.  For each new type of data structure, a vendor would provide a pattern library identifying the basic structure of the data, which the software would then use to "extract the constructs of interest from the document," Weaver says.
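
To illustrate the difference between matching single lines and matching whole constructs, here is a minimal Python sketch. It is not the Dartmouth implementation and does not use a pattern library; the function name `extract_blocks`, the brace-delimited configuration syntax, and the sample data are all assumptions made up for this example. A line-based search for "interface" would return only the two header lines, while the block-aware version returns each construct in full, including nested sub-blocks.

```python
def extract_blocks(text, keyword):
    """Return every brace-delimited block whose header line starts with `keyword`.

    Hypothetical helper for illustration only: it assumes a simple syntax in
    which a block opens with '<keyword> ... {' and closes with a matching '}'.
    """
    blocks = []
    current = None  # lines of the block being collected, or None when outside one
    depth = 0       # brace nesting depth inside the current block

    for line in text.splitlines():
        if current is None:
            # Look for a header line that opens a block of interest.
            if line.strip().startswith(keyword) and line.rstrip().endswith("{"):
                current = [line]
                depth = 1
            continue

        # Inside a block: keep the line and track nesting.
        current.append(line)
        depth += line.count("{") - line.count("}")
        if depth == 0:
            blocks.append("\n".join(current))
            current = None

    return blocks


# Made-up configuration data in an assumed brace-delimited format.
SAMPLE = """\
interface eth0 {
    address 10.0.0.1
    mtu 1500
}
logging {
    level info
}
interface eth1 {
    address 10.0.0.2
    vlan {
        id 7
    }
}
"""

if __name__ == "__main__":
    for block in extract_blocks(SAMPLE, "interface"):
        print(block)
        print("---")
```

Because the result is plain text, it could be written to standard output and consumed by other utilities in a script, in the same way that grep output is.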