I'm not sure it's faster, but you can write code that joins all the patterns into one long pattern, similar to the code below. Time it and see whether it is an improvement in your case, or whether it even works for you. It uses "|" to separate the different patterns, so if any of your regular expressions already relies on a pipe, this will not work as expected.
In my case, this code was for finding name prefixes (Mr., Mrs., Sr., Sra., etc.) and suffixes (Jr., Sr., II, III, etc.), and the resulting String representation of the pattern would look something like this (leaving out the \Q...\E that Pattern.quote wraps around each name):
"(?: |,|^)(jr)(?: |,|$)|(?: |,|^)(jr.)(?: |,|$)|(?: |,|^)(sr.)(?: |,|$)|(?: |,|^)(sr)(?: |,|$)|(?: |,|^)(iii)(?: |,|$)|(?: |,|^)(iv)(?: |,|$)"
I was also using the case-insensitive flag (Pattern.CASE_INSENSITIVE).
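To make the pipe caveat concrete, here is a minimal sketch. The names QuoteDemo, quoted, and unquoted are made up for illustration, and it assumes each entry is wrapped in its own group as in the sample pattern above. With Pattern.quote an entry is matched character for character, so "|" and "." lose their regex meaning; without it, the entry is read as a regular expression.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: the entry is quoted, so it only matches itself literally.
    static Pattern quoted(String entry) {
        return Pattern.compile("(" + Pattern.quote(entry) + ")", Pattern.CASE_INSENSITIVE);
    }

    // Hypothetical helper: the entry is NOT quoted, so it is treated as a regex.
    static Pattern unquoted(String entry) {
        return Pattern.compile("(" + entry + ")", Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        // Quoted: "jr|sr" is one literal string, not a choice between "jr" and "sr".
        System.out.println(quoted("jr|sr").matcher("Sr").matches());      // false
        System.out.println(quoted("jr|sr").matcher("JR|SR").matches());   // true

        // Unquoted: the pipe is an alternation, scoped by the enclosing group.
        System.out.println(unquoted("jr|sr").matcher("Sr").matches());    // true

        // The same applies to ".": quoted it is a literal dot, unquoted it matches any character.
        System.out.println(quoted("jr.").matcher("jrx").matches());       // false
        System.out.println(unquoted("jr.").matcher("jrx").matches());     // true
    }
}
```

So if the entries are real regular expressions, such as log-line patterns, quoting would have to be dropped, and each entry's own groups and pipes then become part of the combined pattern.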
---
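import java.util.regex.Pattern;

// Each name becomes: boundary, captured name, boundary, then "|" to chain the next alternative.
static final String defaultBoundaryStart = "(?: |,|^)(";
static final String defaultBoundaryEnd = ")(?: |,|$)|";

Pattern createMatchPattern(String[] strings, String boundaryStart, String boundaryEnd) {
    StringBuilder pattern = new StringBuilder();
    for (String s : strings) {
        // Pattern.quote makes the string match literally, so "." or "|" in it is not regex syntax.
        pattern.append(boundaryStart).append(Pattern.quote(s)).append(boundaryEnd);
    }
    // Drop the trailing "|" left by the last boundaryEnd; skipped when no strings were given.
    if (pattern.length() > 0) {
        pattern.deleteCharAt(pattern.length() - 1);
    }
    return Pattern.compile(pattern.toString(), Pattern.CASE_INSENSITIVE);
}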
Usage:
Pattern bigLongPattern = createMatchPattern(someArrayOfPatterns, defaultBoundaryStart, defaultBoundaryEnd);
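Once everything is in one pattern, you usually still want to know which entry matched. A minimal sketch of one way to do that, assuming every entry contributes exactly one capture group (true for the boundary strings above, since the boundaries themselves are non-capturing). CombinedPatternDemo, combine, and firstMatchIndex are names invented for this example, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CombinedPatternDemo {
    static final String BOUNDARY_START = "(?: |,|^)(";
    static final String BOUNDARY_END = ")(?: |,|$)";

    // Joins the quoted literals into one alternation; literal i ends up in capture group i + 1.
    static Pattern combine(String[] literals) {
        StringBuilder sb = new StringBuilder();
        for (String literal : literals) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(BOUNDARY_START).append(Pattern.quote(literal)).append(BOUNDARY_END);
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Returns the index of the literal behind the first match in input, or -1 if nothing matches.
    static int firstMatchIndex(Pattern combined, int literalCount, String input) {
        Matcher m = combined.matcher(input);
        if (!m.find()) {
            return -1;
        }
        // Only the alternative that matched has a non-null group.
        for (int g = 1; g <= literalCount; g++) {
            if (m.group(g) != null) {
                return g - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] suffixes = {"jr", "jr.", "sr.", "sr", "iii", "iv"};
        Pattern p = combine(suffixes);
        System.out.println(firstMatchIndex(p, suffixes.length, "John Smith, Jr.")); // 1 ("jr.")
        System.out.println(firstMatchIndex(p, suffixes.length, "John Smith"));      // -1
    }
}
```

The group scan is linear in the number of entries, but it only runs on lines that matched, so it should matter less than the per-line loop over separate patterns. Whether the combined pattern is actually faster still has to be measured on your own data.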
To: seajug-***@public.gmane.org
From: seajug-***@public.gmane.org
Date: Wed, 14 May 2014 11:26:05 -0700
Subject: [seajug] finding multiple patterns in log file
Hi,
Below is one sample pattern I want to search for in a given log file:
static String splitterPattern = "^(.*) INFO (.*) wal.HLogSplitter ..."
static Pattern SPLITTER = Pattern.compile(splitterPattern);
There are 10 (or more) such patterns. I am currently iterating through the patterns for each log line.
Is there a faster way to do this?
Thanks