SPT - Pattern Permutation Design
I. Introduction
This page describes the algorithm used to find all patterns permutation of sub-term from a given input. A pattern is one of possible synonym permutation (please see the example for details).
II. Algorithm
- Find all sub-terms of the input that has synonyms
- Sort by the order of start index, then end index
- subTerms: terms has synonyms
- Find all patterns
- parameters:
- inWordIndex: the index of InWords
- subTermObjIndex: the index of SubTerm Objs
- subTermObj: SubTerm object includes: sub term|start index|end Index
- startIndex: start index of SubTerm object[SubTerm Index]
- prevStartIndex: previous start index of SubTerm Object[SubTerm Index-1]
- patterns: pattern of terms|synonyms
- baseTermsWordCount: word count of the base terms in the pattern
- Initiation
- inWordIndex = 0
- subTermIndex = 0
- patterns = new Vector<Pattern>();
- Go through each InWord (word from input) and perform following actions:
- Go through each subTerm (terms have synonyms)
- CASE-I: current inWord is not a subTerm (starting word)
| Conditions | Actions
|
|---|
- subTermObjIndex >= subTerms.size() or
current inWord is after the last subTerms
- inWordIndex != startIndex
current inWord is not a subTerm
|
- add inWord (no synonyms) to all patterns
- move to next inWord (inWordIndex++)
|
- CASE-II: current inWord is a subTerm (starting word)
- subTermObjIndex < subTerms.size() and
- inWordIndex == startIndex
- instantiate synonyms with current subTermObj
-
| CASE | Conditions | Actions
|
|---|
| II.1 |
- subTermObj.GetEndIndex()-1 == subTermObj.GetStartIndex()
=> the base form of subTerm is a single word
=> this word must be the first subTerm starting with current inWord (sorted)
|
- add synonyms to all patterns
|
|---|
| II.2 |
- subTermObj.GetEndIndex()-1 != subTermObj.GetStartIndex()
=> base term of subTerm is a multi-words
|
- add inWord to all patterns
- get last patterns
- get all patterns with last element removed
- add to lastPatterns if
- startIndex == totalBaseTermsWordCount
=>same position for subTerm replacement
- pattern does not exist in lastPatterns
- add synonyms to all lastPatterns
- add all lastPatterns to patterns
|
|---|
- move to next inWord (inWordIndex++) if current inWord is not the next subTerm (startIndex != nextStartIndex)
- move to next subTerm (subTermObjIndex++) if subTerm is not the last one (subTermObjIndex < subTerms.size( ))
- Add synonyms (Vector<String> with a single element, single word) to patterns
| Condition | Action
|
|---|
patterns.size() == 0 =>no pattern exist, first word |
- instantiate a new pattern
- add synonyms to this new pattern
- add this new pattern to patterns
|
patterns.size() > 0 pattern exist, not the first word |
- go through all pattern
- add synonyms to each pattern if the last element does not contains the subTerm (of this single inWord)
|
III. Java Classes & Method
- Pattern.java: a Java class for pattern object
- PatternPermutation.java: a Java class for pattern permutation
public Vector<pattern> FindPatternPermutation(String inTerm,
SynonymsMapping synonymsMapping, TrieTreeMatch trieTreeMatch,
boolean recursiveFlag)
IV. Examples
- Input: Synonym Rules
| word | synonym
|
|---|
| dog | canine
|
| cat | feline
|
| canine | K9
|
| K9 | bull dog
|
| Dog and cat | pets
|
| puppy and kitty | pets
|
- Input: Synonym Terms
| Terms
|
|---|
| dog
|
| canine
|
| cat
|
| feline
|
| k9
|
| bull dog
|
| dog and cat
|
| pets
|
| puppy and kitty
|
- Input Term:
Who let dog and cat out
| InWord Index | 0 | 1 | 2 | 3 | 4 | 5
|
|---|
| InWord | who | let | dog | and | cat | out
|
|---|
- Permutation Algorithm Example
- Find all sub-terms of the input that has synonyms from
previous section
- Sort by the order of start index, then end index
-
| SubTerm Index | Sub Term | Start Index | End Index
|
|---|
| 0 | dog | 2 | 3
|
| 1 | dog and cat | 2 | 5
|
| 2 | cat | 4 | 5
|
- Find all patterns
| InWord Index | InWord | SubTerm Index | SubTerm Obj | Start Index | Prev Start Index | Case | Patterns Obj
|
|---|
| 0 | who | 0 | dog|2|3 | 2 | -1 | I |
|
|---|
| 1 | let | 0 | dog|2|3 | 2 | -1 | I |
|
|---|
| 2 | dog | 0 | dog|2|3 | 2 | -1 | II.1 | - pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- [2]
| 2 | dog | 1 | dog and cat|2|5 | 2 | 2 | II.2 |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- [2]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
| 3 | and | 2 | cat|4|5 | 4 | 2 | I |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- [4]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
| 4 | cat | 2 | cat|4|5 | 4 | 2 | II.1 |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- cat|feline|
- [5]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
| 5 | out | 3 | null | -1 | 4 | I |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- cat|feline|
- out
- [6]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- out
- [6]
|
|---|
|
|---|
|
|---|
|
|---|
|
|---|
- Outputs:
| Pattern 1
|
|---|
| word|synonyms
|
|---|
| who|
|
| let|
|
| dog|canine|k9|bull dog|
|
| and|
|
| cat|feline|
|
| out|
|
| Pattern 2
|
|---|
| word|synonyms
|
|---|
| who|
|
| let|
|
| dog and cat|pets|puppy and kitty|
|
| out|
|