Python String Data Type Tutorial

In this tutorial we learn about strings as immutable collections of Unicode characters.

We cover their quoting, escape characters, concatenation, and why f-strings are the best formatting to use.

  • What is a string?
  • How to declare / initialize a string
  • The string collection
  • How to access characters in a string with the indexer
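  • String Quotes
    • Single vs double quotes. When to use which
  • How to change a string value
  • How to break a string in source code
  • String triple quotes
  • String escape characters
  • String Concatenation
  • String Formatting
    • %-formatting
    • Don’t use %-formatting
    • .format() function
    • Don’t use the .format() function
    • f-Strings
    • f-String Expressions
    • Multi-line f-Strings
    • Use f-Strings instead of other string formatting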

    What is a string?

    The string data type is one of the collection data types in Python. A string is an immutable collection of Unicode characters.

    How to declare/initialize a string

    Python allows us to easily declare strings by just wrapping letters, words or sentences in either single or double quotes.

    As mentioned before, a string is a collection of single characters.

    In some programming languages, like C, there is no string type, so we have to define arrays of characters explicitly.

     In Python we can define a string directly. However, a string is still a collection of characters under the hood.

    Think of a string as a row in a table with each character in its own separate cell.

    If we consider the word “Hello”, this is what it would look like:

    H | e | l | l | o

    Each character is also mapped to an index, a number that represents its position in the string.

    Considering the word “Hello” again, this is what it would look like:

    Index:     0 | 1 | 2 | 3 | 4
    Character: H | e | l | l | o

    How to access characters in a string with the indexer

    We can use the index number of a character to access its value. We specify the index number of the character we want to access between [ ] (open and close square brackets).

     Example: string access via index
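    A minimal sketch of such an example. The variable names message and characters are only illustrative.

```python
message = "Hello"

# Read each character by its zero-based index
print(message[0])  # H
print(message[1])  # e
print(message[2])  # l
print(message[3])  # l
print(message[4])  # o

# Collect the same characters into a list to show the index-to-character mapping
characters = [message[i] for i in range(len(message))]
```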
    In the example above we access each character of the string individually by using its index in the collection.

    Note: The index number of an indexed collection will always start at 0.

    String Quotes

    As mentioned, we may use either single or double quotes, but the opening and closing quotes must be of the same kind.

    A string also cannot be initialized without quotes.

    Both single and double quotes are often used inside strings. If we don’t want to escape quote characters inside the string, we can simply use the opposite quotes as the string wrapper.
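
    A minimal sketch of the first case, using a made-up sentence: the text contains an apostrophe, so the string is wrapped in double quotes.

```python
# The apostrophe needs no backslash because the wrapper is a double quote
message = "It's a beautiful day"
print(message)
```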

      In the example above, the string is enclosed with double quotes so we don’t need to explicitly escape the single quote.
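
    A sketch of the opposite case, again with a made-up sentence: the text contains double quotes, so the string is wrapped in single quotes.

```python
# The double quotes need no backslash because the wrapper is a single quote
message = 'She said "Hello" to everyone'
print(message)
```
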
      In the example above, we use double quotes inside the string so we simply wrap the whole string in single quotes.

    How to change a string value

    Strings are immutable, so the characters of an existing string cannot be changed at runtime. However, we can assign a new value to the same variable that holds a string.

    If we try to change a character inside the string, the interpreter will raise an error.
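
    A minimal sketch of such an attempt. The assignment is wrapped in try/except here only so the script keeps running and the error can be inspected; the names message and error_text are illustrative.

```python
message = "Hello"

try:
    # Attempt to replace the first character in place
    message[0] = "Y"
except TypeError as error:
    error_text = str(error)
    print("TypeError:", error_text)
```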

    In the example above we try to change the H character to a Y, but because a string is immutable the interpreter raises a TypeError.
     TypeError: 'str' object does not support item assignment

    If we want to change a string at runtime, we have to overwrite it with a new string completely.
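
    A minimal sketch of overwriting a string, with an illustrative variable named message:

```python
message = "Hello"
print(message)

# Assigning again binds the name to a new string; the old string is not modified
message = "Yello"
print(message)
```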

    In the example above the old message is discarded and a new message is created with the same variable name but different string value.

    How to break a string in source code

    Sometimes in our source code we may need to break up a string onto multiple lines. Python doesn’t allow this in the same manner as some other languages (like C#) do.
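
    As an illustration, the sketch below holds such a broken literal as text and hands it to the built-in compile() function, so the error can be observed without stopping the script. The names broken_source and raised are only illustrative.

```python
# The source below splits a double-quoted string across two lines.
# It is kept inside a triple-quoted string so this sketch itself stays valid.
broken_source = '''message = "Hello
World"
'''

try:
    compile(broken_source, "<example>", "exec")
    raised = False
except SyntaxError as error:
    raised = True
    print("SyntaxError:", error.msg)
```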

    If we use the example above in a Python script, the interpreter will produce a SyntaxError.
     SyntaxError: EOL while scanning string literal

    From Python 3.10 onward, the same error is reported as an unterminated string literal.

    The interpreter encounters an End Of Line and assumes that the string should be closed there, but it isn’t.

    To break a string onto multiple lines in the source code, we use a \ (backslash) where we want the string to break to a new line.
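
    A minimal sketch of this syntax, with an illustrative variable named message:

```python
# The backslash joins the two source lines; each part has its own quotes
message = "Hello " \
          "World"
print(message)  # Hello World
```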

    Both lines in the syntax example above are enclosed with their own quotes.

     In the example above the string is broken up into multiple lines in our source code. However, when we print the string, it’s still on the same line.

    If we wanted to create new lines in print, we would have to use an escape character or triple quotes.

    String triple quotes

    Python’s triple quotes allow strings to span multiple lines. We can also include line breaks, tabs, and quote characters without escaping them.

    To initialize a triple quote string we wrap our string in 3 single or double quotes.

      Example: triple quoted string
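      A minimal sketch with made-up text. The names message and lines are only illustrative.

```python
message = """Hello,
    this line starts with four spaces
and "quotes" need no escaping."""
print(message)

# The line breaks and the indentation are part of the string itself
lines = message.splitlines()
```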
           In the example above, the string is printed exactly as it’s formatted in the source code.

    String escape characters

    If we’re not using triple quotes, we can escape certain characters with backslash notation.

     The following table lists some of the commonly used escape characters:

    Sequence   Description            Example                    Output
    \\         Backslash              print('\\')                \
    \'         Single quote ( ' )     print('\'')                '
    \"         Double quote ( " )     print("\"")                "
    \n         Line feed (new line)   print('Hello \n World')    Hello
                                                                  World
    \t         Horizontal tab         print("Hello \t World")    Hello    World
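
    The sequences in the table can be tried out with a short sketch like the one below. The variable names and sample texts are only illustrative.

```python
quote = 'It\'s here'          # \' gives a single quote
speech = "Say \"Hello\""      # \" gives a double quote
path = "C:\\temp"             # \\ gives one backslash
two_lines = "Hello\nWorld"    # \n starts a new line
columns = "Hello\tWorld"      # \t inserts a horizontal tab

for text in (quote, speech, path, two_lines, columns):
    print(text)
```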

    String Concatenation

    To combine, or concatenate, multiple strings together, we use the + (plus) operator.
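
    A minimal sketch of concatenation, with an illustrative variable named greeting:

```python
# The first string ends with a space so the words do not run together
greeting = "Hello " + "World"
print(greeting)
```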

     In the example above, we leave an extra space at the end of Hello as a separator between the words.

    String Formatting

    When we want to combine other data, such as numbers, into a string, Python won’t convert it automatically, so we need some sort of string formatting. Fortunately, we have several options.

    As an example let’s look at the following code:
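
    A sketch of such code, using a hypothetical variable named age. The failing line is wrapped in try/except here only so the script keeps running and the error can be inspected.

```python
age = 30

try:
    # Mixing a str and an int with + is not allowed
    message = "I am " + age + " years old"
except TypeError as error:
    error_text = str(error)
    print("TypeError:", error_text)
```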

      When we use the + operator on two strings, the interpreter concatenates them. When we use it on two int values, the interpreter does arithmetic.

    In the example above we mix a str and an int. Python does not convert the int to a string implicitly, so the interpreter raises a TypeError.

     TypeError: can only concatenate str (not "int") to str

    This is where string formatting comes to the rescue.

    %-formatting

    The original method to format a string was with the % (percent) operator. A placeholder that starts with % is placed within the string at the location where we want our data to appear. The % operator after the string then supplies the data, and the interpreter replaces the placeholder with it.

     The % in a placeholder is followed immediately by a character that denotes the type of data it stands for.
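
    A minimal sketch of %-formatting, with a hypothetical variable named age:

```python
age = 30

# %i marks where the integer goes; the % operator after the string supplies it
message = "I am %i years old" % age
print(message)
```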
     In the example above we use an int as a value, so we use %i as the placeholder.

    The following table shows the characters to be used in string formatting:

    Character   Description
    %c          Single character
    %s          String conversion via str() prior to formatting
    %i          Signed decimal integer
    %d          Signed decimal integer
    %u          Obsolete, identical to %d
    %o          Octal integer
    %x          Hexadecimal integer using lowercase letters
    %X          Hexadecimal integer using uppercase letters
    %e          Exponential notation with lowercase e
    %E          Exponential notation with uppercase E
    %f          Floating point real number
    %g          Uses %e for very small or very large exponents, otherwise %f
    %G          Uses %E for very small or very large exponents, otherwise %F

    Don’t use %-formatting

    %-formatting isn’t great because it’s verbose and has quirks that lead to common errors, such as failing to display tuples and dictionaries correctly.

    Even the official Python documentation points out these quirks and suggests that the newer formatting options may help avoid them.
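
    One such quirk can be sketched as follows. A tuple on the right of the % operator is treated as the list of values to insert, not as a single value, so a tuple has to be wrapped in another tuple to be displayed. The names point, error_text and safe_text are only illustrative.

```python
point = (1, 2)

try:
    # Two values are offered for a single %s placeholder
    text = "Point: %s" % point
except TypeError as error:
    error_text = str(error)
    print("TypeError:", error_text)

# Wrapping the tuple in a one-element tuple passes it as a single value
safe_text = "Point: %s" % (point,)
print(safe_text)
```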

    .format() function

    Python 2.6 introduced a better way to format strings with the .format() function. The placeholder fields are marked with open and close curly braces, and the values that replace them are passed as arguments to the function.

       Example: string format() function
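      A minimal sketch, with an illustrative variable named message:

```python
# Each {} is filled with the next argument, in order
message = "{} {}".format("Hello", "World")
print(message)
```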
      In the example above we replace each instance of the open and close curly braces with a word inside the function’s parameters.

    We can also reference variables by using numbers to order them in the string.

        Example: order replacements by number in the format() function
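      A minimal sketch, with an illustrative variable named message:

```python
# {0} refers to the first argument and {1} to the second,
# so the words can appear in a different order than they are passed
message = "{1} {0}".format("World", "Hello")
print(message)
```
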
      We can go a step further and use names in the placeholders. This lets us pass keyword arguments, reference the attributes and items of the objects we pass, or unpack dictionaries with **.

    We won’t demonstrate it here, but the point is that the .format() function is definitely a step up from %-formatting.

    Don’t use the .format() function

    The .format() function isn’t great because it is still quite verbose, especially when dealing with multiple parameters in longer strings.

    f-Strings

    Python 3.6 introduced us to f-Strings, or “formatted string literals”. f-Strings are string literals that have curly braces containing the expressions that will be replaced with their respective values. The expressions are formatted using the __format__ protocol.

    The syntax is similar to that of the .format() function but much less verbose. An f-String requires us to prefix the string with the letter f.
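
    A minimal sketch of an f-String, assuming Python 3.6 or later. The variables name and age are hypothetical.

```python
name = "World"
age = 30

# Expressions inside the curly braces are evaluated and inserted into the string
message = f"Hello {name}, next year you will be {age + 1}"
print(message)
```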