Prefix

Functions

distance

rapidfuzz.distance.Prefix.distance(s1, s2, *, processor=None, score_cutoff=None)

Calculates the Prefix distance between two strings.

Parameters:
  • s1 (Sequence[Hashable]) – First string to compare.

  • s2 (Sequence[Hashable]) – Second string to compare.

  • processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.

  • score_cutoff (int or None, optional) – Maximum distance between s1 and s2, that is considered as a result. If the distance is bigger than score_cutoff, score_cutoff + 1 is returned instead. Default is None, which deactivates this behaviour.

Returns:

distance – distance between s1 and s2

Return type:

int

normalized_distance

rapidfuzz.distance.Prefix.normalized_distance(s1, s2, *, processor=None, score_cutoff=None)

Calculates a normalized prefix similarity in the range [1, 0].

This is calculated as distance / (len1 + len2).

Parameters:
  • s1 (Sequence[Hashable]) – First string to compare.

  • s2 (Sequence[Hashable]) – Second string to compare.

  • processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.

  • score_cutoff (float, optional) – Optional argument for a score threshold as a float between 0 and 1.0. For norm_dist > score_cutoff 1.0 is returned instead. Default is 1.0, which deactivates this behaviour.

Returns:

norm_dist – normalized distance between s1 and s2 as a float between 0 and 1.0

Return type:

float

similarity

rapidfuzz.distance.Prefix.similarity(s1, s2, *, processor=None, score_cutoff=None)

Calculates the prefix similarity between two strings.

This is calculated as len1 - distance.

Parameters:
  • s1 (Sequence[Hashable]) – First string to compare.

  • s2 (Sequence[Hashable]) – Second string to compare.

  • processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.

  • score_cutoff (int, optional) – Maximum distance between s1 and s2, that is considered as a result. If the similarity is smaller than score_cutoff, 0 is returned instead. Default is None, which deactivates this behaviour.

Returns:

distance – distance between s1 and s2

Return type:

int

normalized_similarity

rapidfuzz.distance.Prefix.normalized_similarity(s1, s2, *, processor=None, score_cutoff=None)

Calculates a normalized prefix similarity in the range [0, 1].

This is calculated as 1 - normalized_distance

Parameters:
  • s1 (Sequence[Hashable]) – First string to compare.

  • s2 (Sequence[Hashable]) – Second string to compare.

  • processor (callable, optional) – Optional callable that is used to preprocess the strings before comparing them. Default is None, which deactivates this behaviour.

  • score_cutoff (float, optional) – Optional argument for a score threshold as a float between 0 and 1.0. For norm_sim < score_cutoff 0 is returned instead. Default is 0, which deactivates this behaviour.

Returns:

norm_sim – normalized similarity between s1 and s2 as a float between 0 and 1.0

Return type:

float