# Portuguese confusion sets
# License: GNU LESSER GENERAL PUBLIC LICENSE Version 2
#
# Line format:
# <word1>|<description1>; <word2>|<description2>; <factor> # optional comment
# <word1> and <word2> are words that can easily be confused.
# <description1> and <description2> are used in the error message to explain each word (optional).
# <factor> is how many times more probable the other word must be
# for the text to be considered potentially incorrect.
# Use a higher value for better precision but lower recall.
# Precision (p) and recall (r) values in the comments come from ConfusionRuleEvaluator
# and are based on Wikipedia, Tatoeba, and Enron data, usually with 1000 random
# sentences (if that many sentences are available for that pair). The number after recall
# is the number of sentences used for evaluation.
# Order is relevant for ambiguous cases like 'know' ('no' or 'now'): the match
# used is the one whose pair comes first in this file.
#
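# Example of the line format (a hypothetical entry for illustration only; the
# words, descriptions, and factor below are assumptions, not an evaluated pair):
#
#   mas|conjunção adversativa; mais|advérbio de quantidade; 10
#
# Read under the rules above: the two words are 'mas' and 'mais', each followed
# by an optional description after '|', and the factor is 10, so one word is
# flagged only when the other is at least 10 times more probable in that context.
# Raising the factor would flag fewer sentences (better precision, lower recall).
#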