String Learner

Extract zipcode from addresses

See the example transformations below.

Enter the text you want to transform, separated by line.
Enter a few examples of the output. 1 to 10 examples should be enough. If you data is complex, you may need more examples.

The results will be here Add an example for an incorrect output and run it again.
Click and wait.

It may take up to a minute to transform your data if you have too many lines.

Transformation examples

Extract hostname from email addresses
Extract data from json file
Phone numbers formatting
Change date format without code
Extract zipcode from addresses
How to reverse first name and last name
How to capitalize first letter of names
Easy format mac addresses
Put quotes around string or number
Convert csv into json
Adhoc transformations

About String Learner

The task of transforming a string from a source format into a target format is encountered in many information-processing tasks. Consider the task of transforming a list of names in the form “firstname lastname” (e.g. “Michael Jackson”) into the target form “lastname first letter of the firstname” (e.g. “Jackson M”). In many domains, identifying an appropriate set of operations that transforms one string to another is challenging, as the space of possible transformations is large. In this work, we investigate the problem of learning string transformation rules from pairs of example strings. We propose a solid way to design these rules based on only four string operations: permutations, insertions, deletions and updates. Additionally, we propose an efficient algorithm to learn such rules, implemented as a combination of variations of well-known string manipulation algorithms. The proposed algorithm has the following desirable properties: it can express any string transformation; it can produce transformation rules that correctly transform a large part of the data, even when a limited number of training examples is provided; it is linear w.r.t the training sample size, which allows it to scale for large tasks; and it easy to understand and implement. We demonstrate the ability of our algorithm over real-world transformation tasks. The results indicate that the algorithm learns transformation rules that are generalizable for a broader range of strings to be transformed, using very few examples. The algorithm is especially useful for spreadsheets and data cleaning tool developers that want to support their end-users on string transformation related tasks.

© 2019-2020 - All right reserved.