Thursday, February 4, 2010

Perl tips - Unicode

First, use Encode;

  • $string = Encode::decode('UTF-8',$text); (assuming the input file (or STDIN) is encoded in UTF-8).
  • You can now handle $string as you would normal strings (e.g. split(//) will split it at character boundaries)
  • Do $text = Encode::encode('UTF-8', $string); before writing it out to file (assuming you want the output file (or STDOUT) in that encoding).
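  • \p{L} - any letter (e.g. the letter 'A')
  • \p{M} - combining mark (e.g. the accent ` that combines with the letter 'A', giving 'À')
  • \p{N} - any numeric character (use \p{Nd} for decimal digits only)
  • \p{P} - punctuation
  • \p{kannada} - any Kannada character
  • \P{...} - inverts the condition (e.g. \P{L} matches any character that is not a letter)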
E.g. to match a line only if it contains no numerals and no punctuation, do
$line !~ m/\p{N}|\p{P}/
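Here is a minimal sketch of the decode / process / encode round trip described above. The sample bytes and the @lines list are made up for illustration; in real use $text would come from a file or STDIN.

```perl
use strict;
use warnings;
use Encode;

# Hypothetical input: the raw UTF-8 bytes for 'À' followed by '1' and ','.
my $text = "\xC3\x80" . "1,";

# Decode bytes into a character string.
my $string = Encode::decode('UTF-8', $text);

# split(//) now splits at character boundaries, not byte boundaries.
my @chars = split(//, $string);
my $num_chars = scalar @chars;

# Keep only the lines that contain no numerals and no punctuation.
my @lines = ("abc def", "abc 123", "abc, def");
my @clean = grep { $_ !~ m/\p{N}|\p{P}/ } @lines;

# Encode back to UTF-8 bytes before writing out.
my $out = Encode::encode('UTF-8', $string);

print "$num_chars characters, ", length($text), " bytes\n";
print "clean line: $_\n" for @clean;
```

The character count is smaller than the byte count because 'À' takes two bytes in UTF-8.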

Backup and restore using SimpleBackup

I am using SimpleBackup for backing up my home directory. Download the utility and extract into some directory. Then do the following -

  • First, cd to the SimpleBackup directory.
  • Run simplebackupconfig. You get a dialog where you can configure some stuff - mainly, the directories to be backed up, and the directories to be excluded from backup, and the directory where the backup files (which will be in .tar.gz form) should be stored.
  • Change the script bkup at line if [ $intdayofwk = 1 -o ! -e "$timestampfile" -o $lastfullbkage -gt 7 ]; to e.g. if [ ! -e "$timestampfile" -o $lastfullbkage -gt 30 ]; to do incremental backup unless it has been 30 days since the last full backup.
  • To restore, decide on the directory or file to restore. cd to /. Run tar -xzf bkupfile.tar.gz --anchored "home/gtholpadi/Desktop/Files/some/dir".
  • This will create the directories home, gtholpadi, etc. down to dir, and also the directories inside dir, if they don't already exist.
  • If extracted files already exist, they are overwritten.
  • Other files are kept intact.
  • To extract a single file, give the path to that file instead of the path to dir.

Update: I have now moved to rsync.

Wednesday, February 3, 2010

Perl tips - array of records, command line params, STDIN/STDOUT handles

Set of records or 2D array
  • To store a set of records, use my @data; $data[$i]->[$j] = $someval; (ith record, jth field). This works like a 2-D array, except that each row can have a different size.
  • To sort the records numerically on column j, do @sorteddata = sort {$a->[$j] <=> $b->[$j]} @data;. Use cmp instead of <=> if the column holds strings.
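A small sketch of the record storage and sorting described above. The names and numbers are made-up sample data, and field 1 is assumed to be numeric.

```perl
use strict;
use warnings;

my @data;

# Each record is a row; rows may have different numbers of fields.
$data[0]->[0] = 'carol'; $data[0]->[1] = 30;
$data[1]->[0] = 'alice'; $data[1]->[1] = 41; $data[1]->[2] = 'extra field';
$data[2]->[0] = 'bob';   $data[2]->[1] = 12;

# Sort numerically on column $j.
my $j = 1;
my @sorteddata = sort { $a->[$j] <=> $b->[$j] } @data;
my @names_by_number = map { $_->[0] } @sorteddata;

# Sort as strings on column 0, using cmp instead of <=>.
my @by_name = sort { $a->[0] cmp $b->[0] } @data;
my @names_by_name = map { $_->[0] } @by_name;

print "by number: @names_by_number\n";
print "by name:   @names_by_name\n";
```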
Command line parameters
  • The format for specifying parameters is cmd OPT1=sth OPT2=sth ...
  • Use foreach (@ARGV) { if (m{^OPT1=}) { $opt1 = $'; } elsif (m{^OPT2=}) { ... } }. Here $' holds the text after the match, i.e. the value of the option.
  • Use $opstream = *STDOUT; (note the '*') to store a file handle (for reading or writing) in a variable, so that you can pass it around and use it via the variable (e.g. print $opstream "qwe";).
  • To get the floor of a number, do use POSIX; $val = POSIX::floor($num);.
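The three tips above, put together as a rough sketch. The option values and the emit helper are hypothetical, and @ARGV is filled in by hand here only so that the example runs without real command line arguments.

```perl
use strict;
use warnings;
use POSIX;

# Hypothetical stand-in for: cmd OPT1=in.txt OPT2=5
@ARGV = ('OPT1=in.txt', 'OPT2=5');

my %opt;
foreach (@ARGV) {
    if (m{^OPT1=}) {
        $opt{OPT1} = $';    # $' is the text after the match
    }
    elsif (m{^OPT2=}) {
        $opt{OPT2} = $';
    }
}

# A helper (hypothetical name) that writes to whatever handle it is given.
sub emit {
    my ($fh, $msg) = @_;
    print $fh $msg;
}

my $opstream = *STDOUT;    # note the '*'
emit($opstream, "OPT1 is $opt{OPT1}, OPT2 is $opt{OPT2}\n");

# floor rounds towards minus infinity, so negative numbers go down.
my $val = POSIX::floor(7.9);
my $neg = POSIX::floor(-7.1);
emit($opstream, "floor values: $val $neg\n");
```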

Perl tips - creating your own utility library using classes

- Creating your own library of utility functions
  • Create a file
  • For each new class in it, start with package MyUtils::MyClass;.
  • Create static class attributes using our $attr;.
  • Create methods using sub meth1{ my @args = @_; ... }. If this will be used as an object method (not static), remember that the first entry of @_ is the reference to the object.
  • An object is just a reference to ordinary data (such as a hash or an array) that has been blessed into the class. It is allocated in the new method (which you must define) and returned.
  • The new method should look like sub new { my ($class, $otherargs) = @_; my $ref_to_somedata = {}; ...; bless $ref_to_somedata, $class; return $ref_to_somedata; }.
  • To create a subclass, start with package MyUtils::MyClass::MySubClass; and declare the parent with our @ISA = ('MyUtils::MyClass');. The nested package name alone does not set up inheritance.
Accessing your library
  • First include use lib "path/to/dir/containing/"; use MyUtils; at the beginning of the .pl file.
  • Use MyUtils::MyClass::meth1 to access static methods.
  • Use $MyUtils::MyClass::attr to access static attribute $attr of the class (note the $).
  • Use MyUtils::MyClass->new(...) to create an object.
  • Use MyUtils::MyClass::MySubClass wherever you are using MyUtils::MyClass for subclasses.
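A single-file sketch of the class layout described above. To keep it runnable on its own, the packages sit in the same file as the code that uses them, so there is no use lib or use MyUtils line. The name field, the name method, and the sample arguments are made up for illustration. The subclass declares its parent through @ISA, which is the standard Perl mechanism for inheritance.

```perl
use strict;
use warnings;

package MyUtils::MyClass;

our $attr = 'shared';    # static class attribute

# Constructor: allocate the data, bless it into the class, return it.
sub new {
    my ($class, $name) = @_;
    my $ref_to_somedata = { name => $name };
    bless $ref_to_somedata, $class;
    return $ref_to_somedata;
}

# Static method: called as MyUtils::MyClass::meth1(...), no object involved.
sub meth1 {
    my @args = @_;
    return scalar @args;
}

# Object method: the first entry of @_ is the object.
sub name {
    my ($self) = @_;
    return $self->{name};
}

package MyUtils::MyClass::MySubClass;

our @ISA = ('MyUtils::MyClass');    # inherit new, meth1 and name

# Override name and call the parent version through SUPER.
sub name {
    my ($self) = @_;
    return 'sub:' . $self->SUPER::name();
}

package main;

my $obj    = MyUtils::MyClass->new('a');
my $sub    = MyUtils::MyClass::MySubClass->new('b');
my $count  = MyUtils::MyClass::meth1(1, 2, 3);
my $static = $MyUtils::MyClass::attr;

print $obj->name(), " ", $sub->name(), " $count $static\n";
```

Because new blesses into whatever $class it receives, the subclass can reuse the inherited constructor and still get objects of its own class.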
Other useful stuff
  • Use $ENV{env_variable} to access the value of environment variable $env_variable in Perl code (useful for specifying "path/to/dir/" of .pm file).

Some LaTeX tips

- Useful commands
  • \footnote{} for adding footnotes in text (numbered)
  • \titlenote{} for adding footnotes to titles, subtitles etc. (*, dagger etc.)
  • \bibliographystyle{abbrv} to get short author names in the References
  • \bibliography{<.bib file>}
  • \fbox{} for enclosing text within a box 
  • To get a new line/line break after the paragraph title, use \paragraph{}\<space>\\
  • To hyperlink text, use \href{URL}{link text} (uses the package hyperref)
- About \begin{figure}
  • \begin{figure/table}[h] - the [h] will try to force the figure/table to appear at that place in the text
  • \epsfig{file=sth.eps, scale=0.25} - quick way to include .eps image.
  • Ensure that label comes immediately after caption, i.e. stick to the order: \caption{}\label{}\end{figure} when including figure. [Updated]
- Numbering parts of figures
  • Add \usepackage{subfig}
  • Put \subfloat[]{ \label{} \epsfig{}} \subfloat{...} ... within \begin{figure} ... \end{figure}
- About table/tabular
  • Use \begin{center}\begin{tabular}... if you don't need a full-fledged table
  • If you have \begin{tabular}{|c|c||c|c|c|}, use \multicolumn{2}{|c||}{text} & \multicolumn{3}{c|}{text} to merge the first 2 and the next 3 columns.
  • Use longtable instead of tabular to allow the table to be split over two pages.
- For multiple authors, use this -
auth1 \and auth2 \and auth3 \and auth4 \and

\email{\texttt{\{auth1,auth2, auth3, auth4\}}}
- To spell check TeX files, use command ispell <.tex file>.

Some things I learnt about paper writing

Start writing sections with Experiments, work backwards till intro, then write conclusion, and then abstract and title.

- make it crisp!
- Abstract, Contributions, Conclusion.
  • State the exact work done in these sections. All three must be in sync.
  • Contributions: only what we have done, what is new (not how good it is); at the end, add a paragraph about paper organization.
- Experiments
  • First describe data, then the method, then the results.
- Highlighting
  • give names for systems you have built
  • make sure you highlight the good stuff (in boxes, by putting them in headings, etc.)
  • section headings should be informative; reader should be able to flip through the paper and zero in on the crux of the paper
- Checks
  • double check notation - should be introduced properly, be simple, and consistent
  • Diagrams should stand on their own (reader should not have to go through the text to understand them)
  • give the right keywords and categories (will determine who will review the paper)
  • for categories, refer to the ACM Computing Classification System
- Others
  • inline equations to save space
  • take the reader smoothly through the paper; she should never get confused.
  • the first page is very important