distance
Editop
- class rapidfuzz.distance.Editop
Tuple-like object describing an edit operation. It has the form (tag, src_pos, dest_pos).
The tags are strings, with these meanings:
tag – explanation
‘replace’ – src[src_pos] should be replaced by dest[dest_pos]
‘delete’ – src[src_pos] should be deleted
‘insert’ – dest[dest_pos] should be inserted at src[src_pos]
Editops
- class rapidfuzz.distance.Editops
List-like object of Editop objects describing how to turn s1 into s2.
- apply(source_string, destination_string)
Apply the editops to source_string.
- Parameters:
source_string (str | bytes) – string to apply editops to
destination_string (str | bytes) – string to use for replacements / insertions into source_string
- Returns:
mod_string – modified source_string
- Return type:
str
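To make the behaviour of apply concrete, the following is a minimal pure-Python sketch that applies (tag, src_pos, dest_pos) tuples by hand. The helper apply_editops is hypothetical and is not rapidfuzz's implementation; it assumes the operations are sorted by position, as they are in the editops for 'spam' and 'park' listed in the Examples.

```python
def apply_editops(editops, src, dest):
    """Apply (tag, src_pos, dest_pos) tuples to src (illustrative sketch only)."""
    parts = []
    src_pos = 0  # next character of src that has not been consumed yet
    for tag, op_src, op_dest in editops:
        # Copy the unchanged characters that precede this operation.
        parts.append(src[src_pos:op_src])
        src_pos = op_src
        if tag == "replace":
            parts.append(dest[op_dest])
            src_pos += 1
        elif tag == "insert":
            parts.append(dest[op_dest])
        elif tag == "delete":
            src_pos += 1
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    # Copy whatever remains of src after the last operation.
    parts.append(src[src_pos:])
    return "".join(parts)


# Editops for 'spam' -> 'park', written as plain tuples.
spam_to_park = [("delete", 0, 0), ("replace", 3, 2), ("insert", 4, 3)]
result = apply_editops(spam_to_park, "spam", "park")
```

Here result is 'park': the 's' is dropped, 'pa' is copied, 'm' is replaced by 'r', and 'k' is appended.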
- as_list()
Convert Editops to a list of tuples.
This is equivalent to
[x for x in editops]
- as_matching_blocks()
Convert to matching blocks.
- Returns:
matching blocks – Editops converted to matching blocks
- Return type:
list[MatchingBlock]
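The relationship between editops and matching blocks can be sketched in pure Python: every gap between two edit operations is a run of equal characters. The helper below is hypothetical and not the library's code. It assumes blocks are (a, b, size) triples and that the list ends with a zero-size sentinel block, a convention modelled on difflib's get_matching_blocks. The source and destination lengths are passed explicitly because plain tuples do not carry them.

```python
def editops_to_matching_blocks(editops, src_len, dest_len):
    """Derive (a, b, size) blocks from sorted editops (illustrative sketch only)."""
    blocks = []
    src_pos = dest_pos = 0
    for tag, op_src, _op_dest in editops:
        # Characters between the previous operation and this one are unchanged.
        if src_pos < op_src:
            size = op_src - src_pos
            blocks.append((src_pos, dest_pos, size))
            src_pos += size
            dest_pos += size
        # Advance past the edited character(s).
        if tag == "replace":
            src_pos += 1
            dest_pos += 1
        elif tag == "delete":
            src_pos += 1
        elif tag == "insert":
            dest_pos += 1
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    # Trailing unchanged characters after the last operation.
    if src_pos < src_len:
        blocks.append((src_pos, dest_pos, src_len - src_pos))
    # Zero-size sentinel block (assumption: same convention as difflib).
    blocks.append((src_len, dest_len, 0))
    return blocks


spam_to_park = [("delete", 0, 0), ("replace", 3, 2), ("insert", 4, 3)]
blocks = editops_to_matching_blocks(spam_to_park, 4, 4)
```

For 'spam' and 'park' the only shared run is 'pa', so blocks holds (1, 0, 2) followed by the sentinel (4, 4, 0).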
- as_opcodes()
Convert to Opcodes.
- Returns:
opcodes – Editops converted to Opcodes
- Return type:
Opcodes
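As a rough illustration of what this conversion involves, the sketch below groups consecutive single-character operations into ranges and fills the gaps with 'equal' ranges. The function editops_to_opcodes is a hypothetical helper operating on plain tuples, not the library's implementation, and it assumes well-formed editops sorted by position.

```python
def editops_to_opcodes(editops, src_len, dest_len):
    """Group sorted editops into 5-tuples (illustrative sketch only)."""
    opcodes = []
    src_pos = dest_pos = 0
    i = 0
    while i < len(editops):
        tag, op_src, op_dest = editops[i]
        # Emit an 'equal' range for any unchanged gap before this operation.
        if src_pos < op_src or dest_pos < op_dest:
            opcodes.append(("equal", src_pos, op_src, dest_pos, op_dest))
            src_pos, dest_pos = op_src, op_dest
        src_start, dest_start = src_pos, dest_pos
        # Merge a run of adjacent operations that share the same tag.
        while True:
            if tag != "insert":
                src_pos += 1
            if tag != "delete":
                dest_pos += 1
            i += 1
            if i >= len(editops) or tuple(editops[i]) != (tag, src_pos, dest_pos):
                break
        opcodes.append((tag, src_start, src_pos, dest_start, dest_pos))
    # Emit a trailing 'equal' range if anything is left.
    if src_pos < src_len or dest_pos < dest_len:
        opcodes.append(("equal", src_pos, src_len, dest_pos, dest_len))
    return opcodes


spam_to_park = [("delete", 0, 0), ("replace", 3, 2), ("insert", 4, 3)]
opcodes = editops_to_opcodes(spam_to_park, 4, 4)
```

For 'spam' and 'park' this yields the same four ranges that the Opcodes example for Levenshtein.opcodes('spam', 'park') lists: delete, equal, replace, insert.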
- copy()
Return a copy of the Editops.
- classmethod from_opcodes(opcodes)
Create Editops from Opcodes.
- inverse()
Invert the Editops so that they describe how to transform the destination string into the source string.
- Returns:
editops – inverted Editops
- Return type:
Editops
Examples
>>> from rapidfuzz.distance import Levenshtein
>>> Levenshtein.editops('spam', 'park')
[Editop(tag=delete, src_pos=0, dest_pos=0),
 Editop(tag=replace, src_pos=3, dest_pos=2),
 Editop(tag=insert, src_pos=4, dest_pos=3)]
>>> Levenshtein.editops('spam', 'park').inverse()
[Editop(tag=insert, src_pos=0, dest_pos=0),
 Editop(tag=replace, src_pos=2, dest_pos=3),
 Editop(tag=delete, src_pos=3, dest_pos=4)]
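The inversion shown in the example follows a simple rule: swap the two positions and exchange 'insert' with 'delete'. A minimal sketch on plain tuples, using a hypothetical helper invert_editops (not the library's code):

```python
# Tags swap roles when source and destination are exchanged.
INVERSE_TAG = {"insert": "delete", "delete": "insert", "replace": "replace"}


def invert_editops(editops):
    """Swap source/destination roles of (tag, src_pos, dest_pos) tuples."""
    return [(INVERSE_TAG[tag], dest_pos, src_pos) for tag, src_pos, dest_pos in editops]


spam_to_park = [("delete", 0, 0), ("replace", 3, 2), ("insert", 4, 3)]
park_to_spam = invert_editops(spam_to_park)
```

Inverting twice returns the original list, which is a convenient sanity check.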
Opcode
- class rapidfuzz.distance.Opcode
Tuple-like object describing an edit operation. It has the form (tag, src_start, src_end, dest_start, dest_end).
The tags are strings, with these meanings:
tag – explanation
‘replace’ – src[src_start:src_end] should be replaced by dest[dest_start:dest_end]
‘delete’ – src[src_start:src_end] should be deleted. Note that dest_start==dest_end in this case.
‘insert’ – dest[dest_start:dest_end] should be inserted at src[src_start:src_start]. Note that src_start==src_end in this case.
‘equal’ – src[src_start:src_end] == dest[dest_start:dest_end]
Note
Opcode is compatible with the tuples returned by difflib’s SequenceMatcher, which makes the two interoperable.
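Because the 5-tuple layout matches difflib, code written against one source of opcodes can consume the other. The sketch below uses only the standard library: a hypothetical helper apply_opcodes rebuilds the destination string from any sequence of such tuples, here the ones produced by difflib.SequenceMatcher.get_opcodes(). It is an illustration of the shared format, not rapidfuzz's implementation.

```python
from difflib import SequenceMatcher


def apply_opcodes(opcodes, src, dest):
    """Rebuild the destination from (tag, i1, i2, j1, j2) tuples (sketch only)."""
    parts = []
    for tag, src_start, src_end, dest_start, dest_end in opcodes:
        if tag == "equal":
            parts.append(src[src_start:src_end])
        elif tag in ("replace", "insert"):
            parts.append(dest[dest_start:dest_end])
        elif tag == "delete":
            pass  # deleted characters contribute nothing to the result
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return "".join(parts)


# Opcodes computed by difflib use the same tuple layout.
difflib_opcodes = SequenceMatcher(None, "spam", "park").get_opcodes()
from_difflib = apply_opcodes(difflib_opcodes, "spam", "park")

# The same helper works on tuples in the layout documented above.
documented = [
    ("delete", 0, 1, 0, 0),
    ("equal", 1, 3, 0, 2),
    ("replace", 3, 4, 2, 3),
    ("insert", 4, 4, 3, 4),
]
from_documented = apply_opcodes(documented, "spam", "park")
```

Both calls return 'park', even though the two libraries may choose different alignments for the same pair of strings.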
Opcodes
- class rapidfuzz.distance.Opcodes
List-like object of Opcode objects describing how to turn s1 into s2. The first Opcode has src_start == dest_start == 0, and each remaining Opcode has src_start equal to the src_end of the preceding Opcode, and likewise dest_start equal to the previous dest_end.
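The contiguity property described above can be checked mechanically. The following sketch uses a hypothetical helper, is_contiguous, that works on plain 5-tuples; it is meant only to illustrate the invariant.

```python
def is_contiguous(opcodes):
    """Check that each opcode starts where the previous one ended."""
    src_pos = dest_pos = 0  # the first opcode must start at (0, 0)
    for _tag, src_start, src_end, dest_start, dest_end in opcodes:
        if (src_start, dest_start) != (src_pos, dest_pos):
            return False
        src_pos, dest_pos = src_end, dest_end
    return True


spam_to_park = [
    ("delete", 0, 1, 0, 0),
    ("equal", 1, 3, 0, 2),
    ("replace", 3, 4, 2, 3),
    ("insert", 4, 4, 3, 4),
]
ok = is_contiguous(spam_to_park)
# Dropping the 'equal' range leaves a gap, so the check fails.
broken = is_contiguous([spam_to_park[0], spam_to_park[2]])
```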
- apply(source_string, destination_string)
Apply the opcodes to source_string.
- Parameters:
source_string (str | bytes) – string to apply opcodes to
destination_string (str | bytes) – string to use for replacements / insertions into source_string
- Returns:
mod_string – modified source_string
- Return type:
str
- as_editops()
Convert to Editops.
- Returns:
editops – Opcodes converted to Editops
- Return type:
Editops
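Conceptually, this conversion expands each range into single-character operations and drops the 'equal' ranges. A minimal sketch on plain tuples follows; opcodes_to_editops is a hypothetical helper, not the library's code, and it assumes that every 'replace' range covers equally many source and destination characters.

```python
def opcodes_to_editops(opcodes):
    """Expand 5-tuples into (tag, src_pos, dest_pos) tuples (sketch only)."""
    editops = []
    for tag, src_start, src_end, dest_start, dest_end in opcodes:
        if tag == "equal":
            continue  # unchanged ranges produce no edit operations
        if tag == "replace":
            # Assumption: both ranges have the same length.
            for offset in range(src_end - src_start):
                editops.append(("replace", src_start + offset, dest_start + offset))
        elif tag == "delete":
            for src_pos in range(src_start, src_end):
                editops.append(("delete", src_pos, dest_start))
        elif tag == "insert":
            for dest_pos in range(dest_start, dest_end):
                editops.append(("insert", src_start, dest_pos))
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return editops


spam_to_park = [
    ("delete", 0, 1, 0, 0),
    ("equal", 1, 3, 0, 2),
    ("replace", 3, 4, 2, 3),
    ("insert", 4, 4, 3, 4),
]
editops = opcodes_to_editops(spam_to_park)
```

The result matches the three operations listed in the Editops example for 'spam' and 'park'.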
- as_list()
Convert Opcodes to a list of tuples, which is compatible with the opcodes of difflib’s SequenceMatcher.
This is equivalent to
[x for x in opcodes]
- as_matching_blocks()
Convert to matching blocks.
- Returns:
matching blocks – Opcodes converted to matching blocks
- Return type:
list[MatchingBlock]
- copy()
Return a copy of the Opcodes.
- classmethod from_editops(editops)
Create Opcodes from Editops.
- inverse()
Invert the Opcodes so that they describe how to transform the destination string into the source string.
- Returns:
opcodes – inverted Opcodes
- Return type:
Opcodes
Examples
>>> from rapidfuzz.distance import Levenshtein
>>> Levenshtein.opcodes('spam', 'park')
[Opcode(tag=delete, src_start=0, src_end=1, dest_start=0, dest_end=0),
 Opcode(tag=equal, src_start=1, src_end=3, dest_start=0, dest_end=2),
 Opcode(tag=replace, src_start=3, src_end=4, dest_start=2, dest_end=3),
 Opcode(tag=insert, src_start=4, src_end=4, dest_start=3, dest_end=4)]
>>> Levenshtein.opcodes('spam', 'park').inverse()
[Opcode(tag=insert, src_start=0, src_end=0, dest_start=0, dest_end=1),
 Opcode(tag=equal, src_start=0, src_end=2, dest_start=1, dest_end=3),
 Opcode(tag=replace, src_start=2, src_end=3, dest_start=3, dest_end=4),
 Opcode(tag=delete, src_start=3, src_end=4, dest_start=4, dest_end=4)]
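For ranges, inversion exchanges the source range with the destination range and swaps 'insert' with 'delete'; 'equal' and 'replace' keep their tags. A short sketch on plain tuples, with invert_opcodes as a hypothetical helper rather than the library's implementation:

```python
# 'equal' and 'replace' are symmetric; 'insert' and 'delete' trade places.
INVERSE_TAG = {
    "insert": "delete",
    "delete": "insert",
    "replace": "replace",
    "equal": "equal",
}


def invert_opcodes(opcodes):
    """Swap the source and destination ranges of each 5-tuple."""
    return [
        (INVERSE_TAG[tag], dest_start, dest_end, src_start, src_end)
        for tag, src_start, src_end, dest_start, dest_end in opcodes
    ]


spam_to_park = [
    ("delete", 0, 1, 0, 0),
    ("equal", 1, 3, 0, 2),
    ("replace", 3, 4, 2, 3),
    ("insert", 4, 4, 3, 4),
]
park_to_spam = invert_opcodes(spam_to_park)
```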