Jaro
Functions
distance
- rapidfuzz.distance.Jaro.distance(s1, s2, *, processor=None, score_cutoff=None)
Calculates the Jaro distance.
- Parameters:
s1 (Sequence[Hashable]) – First string to compare.
s2 (Sequence[Hashable]) – Second string to compare.
processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.
score_cutoff (float, optional) – Optional score threshold as a float between 0 and 1.0. For distance > score_cutoff, 1.0 is returned instead. Default is None, which deactivates this behaviour.
- Returns:
distance – distance between s1 and s2 as a float between 0.0 and 1.0
- Return type:
float
normalized_distance
- rapidfuzz.distance.Jaro.normalized_distance(s1, s2, *, processor=None, score_cutoff=None)
Calculates the normalized Jaro distance.
- Parameters:
s1 (Sequence[Hashable]) – First string to compare.
s2 (Sequence[Hashable]) – Second string to compare.
processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.
score_cutoff (float, optional) – Optional score threshold as a float between 0 and 1.0. For normalized distance > score_cutoff, 1.0 is returned instead. Default is None, which deactivates this behaviour.
- Returns:
normalized distance – normalized distance between s1 and s2 as a float between 0.0 and 1.0
- Return type:
float
similarity
- rapidfuzz.distance.Jaro.similarity(s1, s2, *, processor=None, score_cutoff=None)
Calculates the Jaro similarity.
- Parameters:
s1 (Sequence[Hashable]) – First string to compare.
s2 (Sequence[Hashable]) – Second string to compare.
processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.
score_cutoff (float, optional) – Optional score threshold as a float between 0 and 1.0. For similarity < score_cutoff, 0 is returned instead. Default is None, which deactivates this behaviour.
- Returns:
similarity – similarity between s1 and s2 as a float between 0 and 1.0
- Return type:
float
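For readers who want to see what the score measures, the following is a minimal pure-Python sketch of the textbook Jaro similarity. It is not RapidFuzz's implementation; the function names `jaro_similarity` and `apply_cutoff` are hypothetical, and treating two empty sequences as identical is an assumption of this sketch.

```python
def jaro_similarity(s1, s2):
    """Textbook Jaro similarity (reference sketch, not RapidFuzz's code)."""
    len1, len2 = len(s1), len(s2)
    if len1 == 0 and len2 == 0:
        return 1.0  # assumption: two empty sequences count as identical
    if len1 == 0 or len2 == 0:
        return 0.0

    # Elements only match if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, item in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == item:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Matched elements that appear in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def apply_cutoff(similarity, score_cutoff=None):
    """Mimic the documented cutoff rule: scores below the cutoff become 0."""
    if score_cutoff is not None and similarity < score_cutoff:
        return 0.0
    return similarity


score = jaro_similarity("MARTHA", "MARHTA")
print(round(score, 4))            # 0.9444
print(apply_cutoff(score, 0.95))  # 0.0
```

In this example all six characters match and one pair (T/H) is transposed, which gives (1 + 1 + 5/6) / 3.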
normalized_similarity
- rapidfuzz.distance.Jaro.normalized_similarity(s1, s2, *, processor=None, score_cutoff=None)
Calculates the normalized Jaro similarity.
- Parameters:
s1 (Sequence[Hashable]) – First string to compare.
s2 (Sequence[Hashable]) – Second string to compare.
processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.
score_cutoff (float, optional) – Optional score threshold as a float between 0 and 1.0. For normalized similarity < score_cutoff, 0 is returned instead. Default is None, which deactivates this behaviour.
- Returns:
normalized similarity – normalized similarity between s1 and s2 as a float between 0 and 1.0
- Return type:
float
Performance
The following image shows a benchmark of the Jaro similarity in RapidFuzz
and jellyfish. Jellyfish uses an implementation with a time complexity of O(NM),
while RapidFuzz has a time complexity of O(⌈N/64⌉M).
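The factor of 64 in the second bound is typical of bit-parallel algorithms, which pack one sequence into 64-bit machine words and process a whole word per step. The sketch below only illustrates that bookkeeping and is not RapidFuzz's C++ code. The names `blocks_needed` and `build_match_masks` are hypothetical, N is assumed to be the length of the packed sequence, and Python's unbounded integers stand in for machine words.

```python
WORD_BITS = 64  # width of one machine word


def blocks_needed(n):
    """Number of 64-bit words needed to hold n positions, i.e. ceil(n / 64)."""
    return (n + WORD_BITS - 1) // WORD_BITS


def build_match_masks(pattern):
    """Map each element to a bitmask of the positions where it occurs.

    Bit i of masks[x] is set when pattern[i] == x, so one lookup answers
    "where does x occur?" for every position at once.
    """
    masks = {}
    for i, item in enumerate(pattern):
        masks[item] = masks.get(item, 0) | (1 << i)
    return masks


masks = build_match_masks("abca")
print(bin(masks["a"]))  # 0b1001: 'a' occurs at positions 0 and 3

# Rough step counts for N = 100 and M = 80 under each bound.
n, m = 100, 80
print(n * m)                 # 8000 steps for O(NM)
print(blocks_needed(n) * m)  # 160 steps for O(ceil(N/64) * M)
```

For sequences of up to 64 elements a single word suffices, so the cost grows only with the length of the other sequence.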