Up the Down Codebase: Shell scripting technique for finding unique strings

Thursday, September 2, 2010

Shell scripting technique for finding unique strings

I recently had to search through a set of log files for particular entries and count how many times each one occurred. Shell scripting (via Cygwin) to the rescue!

I was looking for strings in this format:

calc_name=RegularIRA
calc_name=Savings

Here is the solution:

grep -oh "calc_name=\w*" * | sort | uniq -c > calculator_counts.txt

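This searches all files in the current directory for the pattern "calc_name=\w*", where the match stops as soon as a non-word character (such as a symbol) is found. The "-o" flag prints only the matching text, and "-h" leaves out the file names. The matches are then sorted, and "uniq -c" collapses identical lines and prefixes each one with the number of times it occurred (uniq only merges adjacent duplicates, which is why the sort comes first). Finally, the output is redirected to a file.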

The output looks like this (shown here without the "calc_name=" prefix, which grep -o includes in each match):

1332 Annuity
  59 AssetAllocator
4411 AutoEquityLoan
 119 AutoLoan
   4 AutoPayoff
 333 AutoRebate
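
To try the pipeline without real logs, here is a minimal sketch that builds two small sample files in a temporary directory and runs the same command against them. The file names, the ".log" extension, and the log lines are all invented for illustration; only the grep/sort/uniq pipeline comes from the post. It assumes GNU grep (as shipped with Cygwin), where "\w" is available as an extension.

```shell
#!/bin/sh
# Hypothetical sample data: the file names and log lines below are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' \
  'GET /run?calc_name=Savings&user=1' \
  'GET /run?calc_name=RegularIRA&user=2' > app1.log
printf '%s\n' \
  'GET /run?calc_name=Savings&user=3' \
  'no calculator on this line' > app2.log

# -o prints only the matched text; -h suppresses the file-name prefix.
# Globbing on *.log (an assumption about how the logs are named) keeps the
# output file itself from being searched if the command is run a second time.
counts_file="$workdir/calculator_counts.txt"
grep -oh "calc_name=\w*" *.log | sort | uniq -c > "$counts_file"

cat "$counts_file"
```

With this sample data the file ends up with two lines: a count of 1 for calc_name=RegularIRA and a count of 2 for calc_name=Savings. Note that when the original command uses a bare "*" and writes calculator_counts.txt into the same directory, a later run can pick up the previous output file as input, so narrowing the glob or writing the results elsewhere is a safer habit.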

2 comments:

Jeff Olson, September 2, 2010 at 12:38 PM
I should also note, if you want to use Perl-compatible regular expressions, add the "-P" flag to the "grep" command, like so:

grep -Poh "calc_name=[\w.]*" *
Unknown, September 22, 2010 at 8:55 AM
That is why you were johnny on the spot with the suggestion earlier... :)