First, use Encode;
Reading/Writing
- Do $string = Encode::decode('UTF-8', $text); after reading (assuming the input file (or STDIN) is encoded in UTF-8).
- You can now handle $string as you would a normal string (e.g. split(//, $string) will split it into characters rather than bytes).
- Do $text = Encode::encode('UTF-8', $string); before writing it out to a file (assuming you want the output file (or STDOUT) in that encoding).
Regular expressions
- \p{L} - letter (e.g. 'A')
- \p{M} - combining mark (e.g. the combining grave accent that follows 'A' to give 'À')
- \p{N} - number (any numeric character; \p{Nd} matches decimal digits only)
- \p{P} - punctuation
- \p{Kannada} - any character in the Kannada script
- \P{...} - inverts the condition (e.g. \P{L} matches anything that is not a letter)
E.g. to match a line if it contains no numerals and no punctuation do
$line !~ m/\p{N}|\p{P}/
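The steps above can be put together in a short round trip. This is only a sketch: the sample byte string and the helper name is_plain_line are made up for illustration, and in a real script the bytes would come from a file or STDIN.

```perl
use strict;
use warnings;
use Encode;

# Hypothetical helper (not part of Encode): true if the decoded line
# contains no numeric characters and no punctuation.
sub is_plain_line {
    my ($line) = @_;
    return $line !~ m/\p{N}|\p{P}/;
}

# Raw UTF-8 bytes standing in for input read from a file or STDIN:
# 'A', combining grave accent (U+0300), space, Kannada letter KA (U+0C95).
my $bytes = "A\xCC\x80 \xE0\xB2\x95";

# Decode the bytes into a character string.
my $string = Encode::decode('UTF-8', $bytes);

# split(//) now yields characters, not bytes.
my @chars   = split(//, $string);
my @letters = grep { /\p{L}/ } @chars;        # 'A' and the Kannada letter
my @marks   = grep { /\p{M}/ } @chars;        # the combining accent
my @kannada = grep { /\p{Kannada}/ } @chars;  # the Kannada letter only

printf "%d bytes, %d characters\n", length($bytes), scalar(@chars);
print "plain line\n" if is_plain_line($string);

# Encode back to bytes before writing out.
my $out = Encode::encode('UTF-8', $string);
print $out, "\n";
```

The decoded string has 4 characters where the raw input has 7 bytes, which is why the property matches should be run on $string and not on $text.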