Python strings


In this article, you will learn in depth about Python strings and the different kinds of functions and operations that work on them.

Python String: Introduction
How to create a Python string?
How to concatenate strings in Python?
How to calculate length of a string in Python?
How to access characters in a Python string?
Iterating through a string in Python
Python string formatting: Escape Sequences
Python built-in methods for string manipulation

Python strings: Introduction

A string is a sequence of characters that is treated as a single data item. In Python, a string is any group of characters enclosed in double quotes or single quotes.

For example, "This is a string" or 'This is a string'.

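Conceptually, a string behaves like an array of characters stored in order. Python 3 strings are sequences of Unicode characters, so a character does not necessarily occupy 1 byte; in CPython, a string uses 1, 2, or 4 bytes per character, depending on the widest character it contains.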

Python strings are also immutable, meaning they cannot be modified once created.
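
As a small illustration of immutability, the sketch below tries to change a character in place and then builds a new string instead. The variable names (modified_in_place, new_str) are made up for this example.

```python
py_str = "YOLO"

# Strings do not support item assignment, so this raises TypeError.
try:
    py_str[0] = "S"
    modified_in_place = True
except TypeError:
    modified_in_place = False

# To "change" a string, build a new one and bind it to a name.
new_str = py_str + "!"

print(modified_in_place)  # False
print(new_str)            # YOLO!
print(py_str)             # YOLO (the original is untouched)
```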

Creating Python Strings

Creating a string in Python is as simple as assigning a value to a variable. The value must be enclosed in either double quotes " " or single quotes ' '.

A string literal can span multiple lines as well. In such cases, triple quotes are used.

The following examples demonstrate how to create Python strings.

>>> # using single quotes
>>> py_str = 'YOLO'
>>> print(py_str)
YOLO
>>> # using double quotes
>>> py_str = "Hey there"
>>> print(py_str)
Hey there
>>> # using triple quotes for multi-line strings
>>> py_str = """This is first line
... and this is second line."""
>>> print(py_str)
This is first line
and this is second line.

How to concatenate strings in Python?

Concatenation means joining two or more strings together.

In Python, this is achieved with the + operator. Note that + joins the strings exactly as they are and does not insert a space between them.

>>> # concatenating two strings
>>> py_str = "Hi " + "there"
>>> print(py_str)
Hi there

>>> # concatenating multiple strings
>>> py_str = "Hi " + "there " + "programmers"
>>> print(py_str)
Hi there programmers
>>> # using the augmented assignment operator +=
>>> x = "Python "
>>> y = "Strings"
>>> x += y  # this is equivalent to x = x + y
>>> print(x)
Python Strings
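
The + operator only joins strings with other strings. The sketch below, using made-up variable names, shows one way to include a number by converting it with str(), and uses the join() method to combine several words with a separator.

```python
count = 3

# "Items: " + count would raise TypeError because count is an int,
# so it is converted with str() before concatenating.
message = "Items: " + str(count)
print(message)   # Items: 3

# join() concatenates the strings in a sequence, placing the
# separator (here a single space) between them.
words = ["Hi", "there", "programmers"]
sentence = " ".join(words)
print(sentence)  # Hi there programmers
```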

How to calculate the length of a string?

Python has a built-in function, len(string_name), which calculates and returns the length of the string supplied as an argument.

>>> #using len() to find length of a string
>>> py_str = "YOLO YOLO"
>>> print (len(py_str))  #to print the length of string
9

In this example, the length of the string is 9 because len() also counts the space between the two words.

How to access characters in a Python string?

Once we create a Python string, we can't modify it, but we can access its characters and use them in further operations.

In Python, individual characters are accessed using indexing, and a range of characters (a substring) is accessed using slicing. Before jumping into examples, let us first look at indexing and slicing.

What is indexing?

Indexing means locating an element in a Python sequence (such as a string, list, or tuple) by its position in that sequence. Indexing starts from zero, meaning the first element of the sequence has the index 0.

Python uses the indexing operator '[ ]' for this.

There are two ways we can index a string or any other sequence in Python:

  • Index with positive integers: counts from the left, with 0 being the index of the first item in the string or sequence.
  • Index with negative integers: counts from the right, with -1 being the index of the last item in the string or sequence.

For example, consider the following string with 4 characters.

py_str = "abcd"

Now, if we index with positive integers using the indexing operator '[ ]':

py_str[0] = a
py_str[1] = b
py_str[2] = c
py_str[3] = d

And if we index with negative integers, then

py_str[-1] = d
py_str[-2] = c
py_str[-3] = b
py_str[-4] = a
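
The two indexing schemes can be checked with a short script. This is a minimal sketch; the names first, last, last_again and first_again are chosen only for this example.

```python
py_str = "abcd"

# Positive indexes count from the left, starting at 0.
first = py_str[0]
last = py_str[3]

# Negative indexes count from the right, starting at -1.
last_again = py_str[-1]
first_again = py_str[-4]

print(first, last)              # a d
print(last_again, first_again)  # d a

# A negative index i refers to the same character as len(py_str) + i.
same_char = py_str[-2] == py_str[len(py_str) - 2]
print(same_char)                # True
```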

Note: Only integers are allowed in indexing.

Common Programming Errors
Trying to access the 5th character of a 4-character string raises an error: IndexError: string index out of range. Using a floating-point number instead of an integer as an index also raises an error, a TypeError, because string indices must be integers.
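
Both errors can be reproduced safely with try/except. The sketch below simply records the name of each exception; the errors list is a helper introduced for this example.

```python
py_str = "abcd"
errors = []

# Valid indexes are 0 to 3, so index 4 is out of range.
try:
    py_str[4]
except IndexError as exc:
    errors.append(type(exc).__name__)

# A float is not accepted as an index, even if it is a whole number.
try:
    py_str[1.0]
except TypeError as exc:
    errors.append(type(exc).__name__)

print(errors)  # ['IndexError', 'TypeError']
```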

 

What is Slicing in Python?

Slicing means retrieving a range of values from a sequence.

In slicing, we define a starting point from which to retrieve values, an endpoint at which to stop (the value at the endpoint is not included), and a step size. The step size is 1 by default if it is not given explicitly.

The retrieved set of characters from the string is known as a slice.

py_sequence[start:end:step_size]
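
To see the effect of the step size, here is a brief sketch on the string "Programming". The variable names are illustrative only.

```python
py_str = "Programming"

# start=3, end=8, default step of 1: indexes 3, 4, 5, 6, 7
middle = py_str[3:8]
print(middle)         # gramm

# A step of 2 takes every second character: indexes 0, 2, 4, 6, 8, 10
every_second = py_str[0:11:2]
print(every_second)   # Pormig

# Omitting start and end with a step of -1 walks the string backwards.
reversed_str = py_str[::-1]
print(reversed_str)   # gnimmargorP
```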

Now that we know about indexing and slicing, let’s learn about using slicing and indexing to access characters from Python strings with examples.

  • To access a single character: Indexing
py_str[index_char]

Where index_char is the index of the character to be accessed from string py_str.

For example, if we have a string named py_str, we can access its 5th character by using py_str[4] as the indexing starts from 0.

  • To access a range of characters or a segment of the string: Slicing
py_str[m:n]

This returns the characters from index m up to, but not including, index n.

Example:

>>> py_str = "Programming"
>>> #accessing 4th character using positive integer
>>> print (py_str[3]) #remember indexing starts from 0
g
>>> #accessing 3rd character from right using negative integer
>>> print (py_str[-3])  #indexing from right starts with -1
i
>>> #slicing out 1st character to 7th character
>>> print (py_str[0:7])  #if not mentioned step size is 1
Program
>>> #slicing 1st character to 5th last
>>> print (py_str[0:-4])
Program
>>> #to access whole string
>>> print (py_str[:])
Programming
>>> #to slice out whole string starting from 2nd character
>>> print (py_str[1:])
rogramming

Iterating through a string in Python

A for loop iterates over a Python string one character at a time.

py_str = "Hey"
# visit each character of the string in order
for char in py_str:
    print(char)

Output:

This will generate the following output.

H
e
y

Now let's write another example, using a while loop to check whether a character is present in the string.

py_str = "trytoprogram"
length = len(py_str)
count = 0
# step through the indexes until 'o' is found or the string ends
while count < length:
    if py_str[count] == 'o':
        print('character found')
        break
    count = count + 1

Output:

This will generate the following output.

character found
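
The while loop above spells out the search step by step. For everyday code, the same check is usually written with the in operator, and the find() method reports where the character occurs. A minimal sketch, with illustrative variable names:

```python
py_str = "trytoprogram"

# The in operator returns True if the substring occurs anywhere.
found = "o" in py_str
print(found)     # True

# find() returns the index of the first occurrence, or -1 if absent.
position = py_str.find("o")
missing = py_str.find("z")
print(position)  # 4
print(missing)   # -1
```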

Python string formatting: Escape characters

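An escape sequence is a combination of characters that begins with a backslash '\' and is interpreted differently from the literal characters it contains. It lets us write characters that are hard to type directly in a string, such as quotes, tabs or newlines.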

When do we need escape sequences?

We need escape sequences in many situations while coding. Here is one example.

In Python, a double quoted string literal can easily have a single quote in between and a single quote string can have a double quote in between.

But what if we have both double quotes and single quotes in a string?

Let us consider the following example.

>>> print ("Gary said-"I don't like cats".")

SyntaxError: invalid syntax

This raises an error called SyntaxError: invalid syntax.

That is because when the interpreter encounters the double quote just before I, it interprets it as the end of the string, and the text that follows is no longer valid syntax.

We can avoid such errors with triple quotes, but if the same triple quotes occur inside the string, the error appears again.

So, escape sequences are used to generate quotes in such cases. Here is the example demonstrating the solution using escape sequences.

>>> #using triple quotes
>>> print ("""Gary said-"I don't like cats".""")
Gary said-"I don't like cats".
>>> #using escape sequences \" for printing double quotes and \' for single quotes
>>> print ("Gary said-\"I don\'t like cats\".")
Gary said-"I don't like cats".

Here is the list of escape sequences in Python.

Escape sequence    Meaning
\newline           Backslash and newline ignored
\\                 Backslash ( \ )
\'                 Single quote ( ' )
\"                 Double quote ( " )
\a                 ASCII bell or alert
\b                 ASCII backspace
\f                 ASCII form feed
\n                 ASCII line feed
\r                 ASCII carriage return
\t                 ASCII horizontal tab
\v                 ASCII vertical tab
\ooo               Character with octal value ooo
\xhh               Character with hexadecimal value hh
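
A few of these escape sequences in action. This is a short sketch with made-up variable names; note that each escape sequence counts as a single character.

```python
tabbed = "name\tvalue"      # \t is one tab character
two_lines = "first\nsecond" # \n starts a new line
path = "C:\\temp"           # \\ produces a single backslash
hex_char = "\x41"           # hexadecimal 41 is the letter A
octal_char = "\101"         # octal 101 is also the letter A

print(len(tabbed))             # 10
print(two_lines.splitlines())  # ['first', 'second']
print(path)                    # C:\temp
print(hex_char, octal_char)    # A A
```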

Python built-in methods for string manipulation

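Python has many built-in functions and methods for working with strings. In the list below, all(), any(), ascii(), len(), max() and min() are built-in functions that take the string as an argument; the rest are string methods, called as py_str.method(). Because strings are immutable, methods that change a string return a new string and leave the original untouched.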

Python built-in methods with description
all(str)
Returns True if all the elements in the iterable are true (or if the iterable is empty).
any(str)
Returns True if any element in the iterable is true.
ascii(str)
Returns a printable representation of the string 'str', with non-ASCII characters escaped.
capitalize()
Returns a copy of the string with the first character capitalized and the rest lowercased.
center(width)
Returns the string centered in a string of the given width, padded with spaces by default.
count(m)
Counts how many times m occurs in the string (non-overlapping occurrences).
decode(encoding='UTF-8', errors='strict')
Decodes bytes using the codec registered for encoding. In Python 3 this is a method of bytes objects, not of strings.
encode(encoding='UTF-8', errors='strict')
Returns an encoded version of the string as a bytes object.
endswith(suffix)
Returns True if the string ends with the specified suffix.
expandtabs(tabsize)
Replaces each tab in the string with one or more spaces, depending on the current column and tabsize (8 by default).
find(str)
Returns the lowest index at which 'str' is found in the string, or -1 if it is not found.
index(str)
Same as find(), but raises a ValueError if 'str' is not found.
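isalnum()
Returns True if the string has at least 1 character and all characters are alphanumeric, and False otherwise.
isalpha()
Returns True if the string has at least 1 character and all characters are alphabetic.
isdigit()
Returns True if the string has at least 1 character and all characters are digits.
islower()
Returns True if all cased characters in the string are lowercase and there is at least 1 cased character.
isnumeric()
Returns True if the string has at least 1 character and all characters are numeric.
isspace()
Returns True if the string has at least 1 character and all characters are whitespace.
istitle()
Returns True if the string is titlecased, that is, each word starts with an uppercase letter followed by lowercase letters.
isupper()
Returns True if all cased characters in the string are uppercase and there is at least 1 cased character.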
join(seq)
Returns the concatenation of the strings in the sequence 'seq', using the string it is called on as the separator.
len(str)
Returns the length of the string 'str'.
ljust(width)
Returns the string left-justified in a string of the given width, padded with spaces by default.
lower()
Returns a copy of the string with all uppercase letters converted to lowercase.
lstrip()
Returns a copy of the string with leading whitespace removed (or the leading characters given as an argument).
maketrans()
Returns a translation table that can be used with translate(). It is called as str.maketrans().
max(str)
Returns the character in 'str' with the highest Unicode code point.
min(str)
Returns the character in 'str' with the lowest Unicode code point.
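replace(old, new)
Returns a copy of the string with all occurrences of the 'old' substring replaced by 'new'.
rfind(str, start, end)
Returns the highest index at which 'str' is found within the slice [start:end], or -1 if it is not found.
rindex(str)
Same as rfind(), but raises a ValueError if 'str' is not found.
rjust(width)
Returns the string right-justified in a string of the given width, padded with spaces by default.
rstrip()
Returns a copy of the string with trailing whitespace removed (or the trailing characters given as an argument).
split(str)
Returns a list of the substrings obtained by splitting the string at each occurrence of the separator 'str'. If no separator is given, it splits on whitespace.
splitlines()
Returns a list of the lines in the string, splitting at line boundaries.
startswith(str)
Returns True if the string starts with the prefix 'str'.
strip([chars])
Returns a copy of the string with both leading and trailing whitespace removed (or the characters in 'chars', if given).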
swapcase()
Returns a copy of the string with uppercase characters converted to lowercase and vice versa.
title()
Returns a titlecased copy of the string, in which each word starts with an uppercase letter.
translate(table)
Translates the string according to a translation table created with maketrans().
upper()
Returns a copy of the string with all lowercase letters converted to uppercase.
zfill(width)
Returns a copy of the string padded on the left with 0's until it reaches the given width.
isdecimal()
Returns True if the string has at least 1 character and all characters are decimal characters.
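
To tie a few of these together, here is a short sketch that applies several of the methods listed above to one string. The sample string and variable names are chosen for illustration; each method returns a new value and leaves py_str unchanged.

```python
py_str = "  Python Strings  "

stripped = py_str.strip()                       # remove outer whitespace
upper = stripped.upper()                        # convert to uppercase
words = stripped.split(" ")                     # split on single spaces
joined = "-".join(words)                        # join with a hyphen
replaced = stripped.replace("Strings", "Text")  # swap a substring
position = stripped.find("Strings")             # index of the substring
padded = "42".zfill(5)                          # pad with leading zeros
centered = "hi".center(6, "*")                  # center using '*' as fill

print(stripped)   # Python Strings
print(upper)      # PYTHON STRINGS
print(words)      # ['Python', 'Strings']
print(joined)     # Python-Strings
print(replaced)   # Python Text
print(position)   # 7
print(padded)     # 00042
print(centered)   # **hi**
```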