1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
/*! A lower level API for packed multiple substring search, principally for a small number of patterns. This sub-module provides vectorized routines for quickly finding matches of a small number of patterns. In general, users of this crate shouldn't need to interface with this module directory, as the primary [`AhoCorasick`](../struct.AhoCorasick.html) searcher will use these routines automatically as a prefilter when applicable. However, in some cases, callers may want to bypass the Aho-Corasick machinery entirely and use this vectorized searcher directly. # Overview The primary types in this sub-module are: * [`Searcher`](struct.Searcher.html) executes the actual search algorithm to report matches in a haystack. * [`Builder`](struct.Builder.html) accumulates patterns incrementally and can construct a `Searcher`. * [`Config`](struct.Config.html) permits tuning the searcher, and itself will produce a `Builder` (which can then be used to build a `Searcher`). Currently, the only tuneable knob are the match semantics, but this may be expanded in the future. # Examples This example shows how to create a searcher from an iterator of patterns. By default, leftmost-first match semantics are used. (See the top-level [`MatchKind`](../enum.MatchKind.html) type for more details about match semantics, which apply similarly to packed substring search.) ``` use aho_corasick::packed::{MatchKind, Searcher}; # fn example() -> Option<()> { let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?; let matches: Vec<usize> = searcher .find_iter("foobar") .map(|mat| mat.pattern()) .collect(); assert_eq!(vec![0], matches); # Some(()) } # if cfg!(target_arch = "x86_64") { # example().unwrap() # } else { # assert!(example().is_none()); # } ``` This example shows how to use [`Config`](struct.Config.html) to change the match semantics to leftmost-longest: ``` use aho_corasick::packed::{Config, MatchKind}; # fn example() -> Option<()> { let searcher = Config::new() .match_kind(MatchKind::LeftmostLongest) .builder() .add("foo") .add("foobar") .build()?; let matches: Vec<usize> = searcher .find_iter("foobar") .map(|mat| mat.pattern()) .collect(); assert_eq!(vec![1], matches); # Some(()) } # if cfg!(target_arch = "x86_64") { # example().unwrap() # } else { # assert!(example().is_none()); # } ``` # Packed substring searching Packed substring searching refers to the use of SIMD (Single Instruction, Multiple Data) to accelerate the detection of matches in a haystack. Unlike conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring search tend to do better with a small number of patterns, where as Aho-Corasick generally maintains reasonably consistent performance regardless of the number of patterns you give it. Because of this, the vectorized searcher in this sub-module cannot be used as a general purpose searcher, since building the searcher may fail. However, in exchange, when searching for a small number of patterns, searching can be quite a bit faster than Aho-Corasick (sometimes by an order of magnitude). The key take away here is that constructing a searcher from a list of patterns is a fallible operation. While the precise conditions under which building a searcher can fail is specifically an implementation detail, here are some common reasons: * Too many patterns were given. Typically, the limit is on the order of 100 or so, but this limit may fluctuate based on available CPU features. * The available packed algorithms require CPU features that aren't available. For example, currently, this crate only provides packed algorithms for `x86_64`. Therefore, constructing a packed searcher on any other target (e.g., ARM) will always fail. * Zero patterns were given, or one of the patterns given was empty. Packed searchers require at least one pattern and that all patterns are non-empty. * Something else about the nature of the patterns (typically based on heuristics) suggests that a packed searcher would perform very poorly, so no searcher is built. */ pub use packed::api::{Builder, Config, FindIter, MatchKind, Searcher}; mod api; mod pattern; mod rabinkarp; mod teddy; #[cfg(test)] mod tests; #[cfg(target_arch = "x86_64")] mod vector;