downloads | documentation | faq | getting help | mailing lists | reporting bugs | php.net sites | links | my php.net 
search for in the  

<lcfirstlocaleconv>
Last updated: Thu, 26 Jun 2008

levenshtein

(PHP 4 >= 4.0.1, PHP 5)

levenshtein — Calculate Levenshtein distance between two strings

Description

int levenshtein ( string $str1 , string $str2 )
int levenshtein ( string $str1 , string $str2 , int $cost_ins , int $cost_rep , int $cost_del )

The Levenshtein distance is defined as the minimal number of characters you have to replace, insert or delete to transform str1 into str2 . The complexity of the algorithm is O(m*n), where n and m are the length of str1 and str2 (rather good when compared to similar_text(), which is O(max(n,m)**3), but still expensive).

In its simplest form the function will take only the two strings as parameter and will calculate just the number of insert, replace and delete operations needed to transform str1 into str2 .

A second variant will take three additional parameters that define the cost of insert, replace and delete operations. This is more general and adaptive than variant one, but not as efficient.

Parameters

str1

One of the strings being evaluated for Levenshtein distance.

str2

One of the strings being evaluated for Levenshtein distance.

cost_ins

Defines the cost of insertion.

cost_rep

Defines the cost of replacement.

cost_del

Defines the cost of deletion.

Return Values

This function returns the Levenshtein-Distance between the two argument strings or -1, if one of the argument strings is longer than the limit of 255 characters.

Examples

Example #1 levenshtein() example

<?php
// input misspelled word
$input = 'carrrot';

// array of words to check against
$words  = array('apple','pineapple','banana','orange',
              
'radish','carrot','pea','bean','potato');

// no shortest distance found, yet
$shortest = -1;

// loop through words to find the closest
foreach ($words as $word) {

  
// calculate the distance between the input word,
   // and the current word
  
$lev = levenshtein($input, $word);

  
// check for an exact match
  
if ($lev == 0) {

      
// closest word is this one (exact match)
      
$closest = $word;
      
$shortest = 0;

      
// break out of the loop; we've found an exact match
      
break;
   }

  
// if this distance is less than the next found shortest
   // distance, OR if a next shortest word has not yet been found
  
if ($lev <= $shortest || $shortest < 0) {
      
// set the closest match, and shortest distance
      
$closest  = $word;
      
$shortest = $lev;
   }
}

echo
"Input word: $input\n";
if (
$shortest == 0) {
   echo
"Exact match found: $closest\n";
} else {
   echo
"Did you mean: $closest?\n";
}

?>

The above example will output:

Input word: carrrot
Did you mean: carrot?



add a noteadd a note User Contributed Notes
Calculate Levenshtein distance between two strings
There are no user contributed notes for this page.




<lcfirstlocaleconv>
Last updated: Thu, 26 Jun 2008
show source | credits | sitemap | contact | advertising | mirror sites
Copyright © 2001-2005 The PHP Group
All rights reserved.
This unofficial mirror is operated at: http://phpbuilder.com/
Last updated: Tue Nov 1 20:20:59 2005 EST
Columns / Articles | Tips / Quickies | News | News Linking and RSS Feeds | Shared Code Library
Mail Archives | Support / Discussion Forums | Get Started! Links | Contribute! | Docs