Data Structure - Trie
Trie (digital tree, radix tree or prefix tree) is a tree data structure
that is used to store a set of strings or string/value pairs.
Each node of the tree corresponds to a prefix of a string from the set.
The root of the tree represents an empty prefix.
All the descendants of a node share the common prefix associated with that node.
Trie can be used to efficiently perform the following operations:
Trie is often used as a data structure to store dictionary of words. It is often found in
the following applications: full-text index, syntax correction programs, auto suggestion lists.
- Check if a given string belongs to the set.
- Find all strings that start from a given prefix.
- Find all strings that match given pattern (wildcard or regular expression).
- Find all approximate matches from a given string, i.e. implement fuzzy search.
Nodes in the trie can have a lot of children. For a UTF-32 encoded strings, each node
might have up to 65536 children. This is unfortunate and in practical applications strings
are usually encoded using variable length encoding schemes. For example,
Ternary search tree is a balance between binary search tree and trie: it is
space efficient and support the same operations as trie. Each node in ternary search
tree can have up to three children.
Both implemented data structures support fuzzy search based on
see LevenshteinMatcher class in Trie.cs.
Trie build from words: