Monday, February 20, 2012

Literal strings in Python

Some of the many ways to write string literals in Python 2, with the value each one produces shown in the comment:

s = "qwe\nasd\nzxc"  #gives 'qwe\nasd\nzxc'
s = 'qwe\nasd\nzxc'  #gives 'qwe\nasd\nzxc'
s = 'qwe\
\nasd\
\nzxc'  #gives 'qwe\nasd\nzxc'
s = '''qwe
asd
zxc'''  #gives 'qwe\nasd\nzxc'
s = '''qwe\\n
asd
zxc'''  #gives 'qwe\\n\nasd\nzxc'
s = r'qwe\n\
asd\
zxc'  #gives 'qwe\\n\\\nasd\\\nzxc'; in a raw string, a '\' before a line break stays in the string together with the newline.
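
A quick way to convince yourself of the results above is to compare the forms directly. This is only a sketch, not part of the original listing: the variable names are made up for illustration, and it avoids the u/ur prefixes so that it should run on both Python 2 and Python 3.

```python
# Four spellings of the same three-line string.
double_quoted = "qwe\nasd\nzxc"
single_quoted = 'qwe\nasd\nzxc'
continued = 'qwe\
\nasd\
\nzxc'
triple_quoted = '''qwe
asd
zxc'''

# All four literals produce an identical value.
all_equal = double_quoted == single_quoted == continued == triple_quoted

# In a raw string the backslashes are kept, and so are the line breaks
# that follow them.
raw = r'qwe\n\
asd\
zxc'

# The same value built piece by piece, to make every character explicit.
raw_expected = 'qwe' + '\\' + 'n' + '\\' + '\n' + 'asd' + '\\' + '\n' + 'zxc'

print(all_equal)
print(raw == raw_expected)
print(repr(raw))
```

The raw version is four characters longer than the others: three extra backslashes plus the literal 'n' that is no longer part of an escape.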
s = u'qwe\u0020asd'  #gives a Unicode string 'qwe asd' (U+0020, i.e. hex 20, is the code point for space)
s = ur'qwe\u0020asd'  #gives a Unicode string 'qwe asd' (\u escapes are still processed in a raw Unicode literal)
s = ur'qwe\\u0020asd'  #gives a Unicode string 'qwe\\\\u0020asd' (the desired format for use in regular expressions)
s = u'qwe\u0020asd'.encode('utf-8')  #gives a byte string (str) holding the Unicode string encoded as UTF-8
s1 = unicode(s, 'utf-8')  #gives back a Unicode string by decoding the bytes in s as UTF-8
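
The encode/decode pair in the last two lines can be checked with a small round trip. This is a sketch under a couple of assumptions: it uses .decode('utf-8') in place of unicode(s, 'utf-8') (the two do the same job on Python 2) so that it should also run on Python 3.3 or later, and the 'caf\u00e9' string is an invented example, not something from the listing above.

```python
text = u'qwe\u0020asd'             # \u0020 is the space character
encoded = text.encode('utf-8')     # bytes (a plain str on Python 2)
decoded = encoded.decode('utf-8')  # back to a Unicode string

# Decoding with the same encoding restores the original text.
round_trip_ok = decoded == text
print(round_trip_ok)

# For pure ASCII text the UTF-8 bytes are as long as the text itself.
print(len(text) == len(encoded))

# A non-ASCII character takes more than one byte in UTF-8
# (hypothetical example string, U+00E9 is e with an acute accent).
accented = u'caf\u00e9'
accented_bytes = accented.encode('utf-8')
print(len(accented))
print(len(accented_bytes))
```

The point of the last part is that the length of the encoded byte string and the length of the Unicode string only agree for ASCII content, which is why it matters which of the two types a variable holds.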