Perl Initialize Array

Table of Contents

Introduction

Initializing arrays in Perl: tips and tricks, common errors, and how to avoid them.

  • Direct Assignment:
  • Assigning Range:
  • Initializing with qw Operator:
  • Understanding the Context:
  • Using Array Functions:
  • Off-by-One Errors:
  • Incorrect Context:
  • Not Initializing Properly:

An array is a type of Perl variable. An array variable is an ordered collection of any number (zero or more) of elements. Each element in an array has an index, which is a non-negative integer. Perl arrays are (like nearly everything else in the language) dynamic: they are resized as needed, and they are untyped, or generic, which is to say, an array doesn't know or enforce the type of its elements.

Actually, the values of Perl array elements can only be scalars. This may sound like a limitation, if you think of scalars only as comprising numbers and strings; but a scalar can also be a reference to any Perl data variable type: scalar, array, hash, etc. Therefore, by storing a reference to any data type in an array's elements, nested data structures are possible. Other scalar types, such as filehandles and the special undef value, are also naturally allowed in the elements of an array.

So, given those characteristics of a Perl array, what kinds of things would you want to do with it? That is, what operations should be able to act on it? You might conceive different sets of operations, or interfaces, depending on how you expect to use an array in your program: as a stack or queue (that is, only working with its ends), or as a table of scalars (that is, working with all of its elemental parts). Perl arrays can be used in all those ways, and more.

This tutorial focuses specifically on the array variable type. There are many things you can do in Perl with lists which will also work on arrays; for example, you can iterate over their contents using foreach. Those things are not discussed here.

Simple assignment does the job: assign a list to the array. The key points are that assignment to an array gives list context to the right hand side, and that the values are inserted in the array in the same order as they occur in the list, beginning with array index zero. For example, after executing @array = ('a', 'b', 'c'); element 0 will contain 'a', element 1 will contain 'b', and so on.

Whenever an array is assigned to like this, any contents it may have had before the assignment are removed!
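A minimal sketch of initialization by assignment. The variable names are illustrative, and the range and qw forms are included only as other common ways to build the list on the right hand side.

```perl
use strict;
use warnings;

# Assign a list; values land in order, starting at index 0.
my @array = ('a', 'b', 'c');
print "$array[0] $array[1] $array[2]\n";   # a b c

# A range and the qw operator also produce lists.
my @digits = (1 .. 5);
my @words  = qw(red green blue);

# Assigning again replaces the previous contents entirely.
@array = ('x');
print scalar(@array), "\n";                # 1
```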

To clear an array, simply assign a zero-length list: @array = (); Assigning a value such as undef, 0, or '' will not work! Rather, it will leave the array containing one element, with that one value. That is, @array = 0; and @array = (0); are functionally identical.
Note that if your goal is to assign the one-element list to the array, omitting the parentheses is considered to be bad style, though technically they are not strictly necessary in this case.
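The difference between assigning an empty list and assigning a single value can be sketched as follows (variable names are illustrative):

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

@array = ();                  # empty list: the array now has no elements
print scalar(@array), "\n";   # 0

@array = 0;                   # same as @array = (0): one element, value 0
print scalar(@array), "\n";   # 1

@array = (undef);             # one element whose value is undef
print scalar(@array), "\n";   # 1
```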

To get the "length" or "size" of an array, simply use it in a scalar context. For example, you can "assign" the array to a scalar variable, as in $count = @array; and the scalar variable will afterwards contain the count of elements in the array. Other scalar contexts work as well, for example print "# Elements: " . @array . "\n"; (Yes, print gives its arguments list context, but the dot (string concatenation) operator takes precedence.)

You can always force scalar context on an array by using the function named scalar, as in scalar(@array). Note that this is a read-only property; you cannot change the length of the array by assigning a scalar to the array variable. For example, @array = 0; does not empty the array (as stated above).
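A short sketch of the three scalar contexts just described (the variable names are illustrative):

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

my $count = @array;                       # scalar assignment yields 3
print "# Elements: " . @array . "\n";     # concatenation is scalar context
print "Count: ", scalar(@array), "\n";    # scalar() forces scalar context
```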

Often, you want to know what is the highest index in an array, that is, the index of its last element. Perl provides a special syntax for obtaining this value: $#array. This is useful, for example, when you want to create a list of all the indices in an array, as in foreach (0 .. $#array) { ... }

Unlike scalar(@array), $#array is a settable property. When you assign to an array's $#array form, you cause its length (number of elements) to grow or shrink accordingly. If the length increases, the new elements will be uninitialized (that is, they'll be undef). If the length decreases, elements will be dropped from the end. (Note, however, that perl dynamically sizes arrays, so forcing the length of an array like this is not something you'd normally need to do.)

Given that $#array is assignable, you can clear an array by assigning -1 to its $#array form. (Why -1? Well, that's what you see in $#array if @array is empty.) Generally, this is not considered good style, but it's acceptable.
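Reading and assigning the highest index can be sketched as follows (variable names are illustrative):

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');
print "last index: $#array\n";     # 2

for my $i (0 .. $#array) {         # every valid index
    print "$i: $array[$i]\n";
}

$#array = 4;                       # grow: elements 3 and 4 are undef
print scalar(@array), "\n";        # 5

$#array = 0;                       # shrink: only 'a' remains
print "@array\n";                  # a

$#array = -1;                      # empty the array
print scalar(@array), "\n";        # 0
```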

Another way to clear an array is undef @array. This technique should be used with caution, because it frees up some memory used internally to hold the elements. In most cases, this isn't worth the processing time. About the only situation in which you'd want to do this is if @array has a huge number of elements, and @array will be re-used after being cleared but will not hold a huge number of elements again.

Beware: As mentioned above, assigning @array = undef does not clear an array. Unlike the case with scalars, @array = undef and undef @array are not equivalent!
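The two forms behave differently, as this small sketch shows (variable names are illustrative):

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

undef @array;                           # clears the array and frees its storage
my $after_undef_function = @array;      # 0

@array = undef;                         # NOT a clear: assigns the list (undef)
my $after_undef_assignment = @array;    # 1

print "$after_undef_function $after_undef_assignment\n";   # 0 1
```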

To get the entire list of values stored in an array at any given time, simply use it in a list context. This is useful for iterating over the list of values stored in an array, one at a time, as in foreach (@array) { ... } This works because in the foreach control construct, the stuff inside the parentheses is expected to be a list or, more precisely, an expression which will be evaluated in list context and is expected to result in a list of (zero or more) scalar values.

What's the difference between these two lines of code:

Answer: or all in @array.
In the second, the array @x is set to a copy of @array.


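The function to remove a single element from the end of an array is pop. For example, if @array holds three elements, then after $x = pop @array; the variable $x will contain the former last element, and @array will be left with its first two elements.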

Note: By "end", we mean the end of the array with the highest index.

Use the push function to add a number of (scalar) values to the end of an array, as in push @array, $value1, $value2;

The shift function removes one value from the beginning of the array. That is, it removes (and returns) the value in element zero, and shifts all the rest of the elements down one, with the effect that the number of elements is decreased by one. For example, if @array holds three elements, then after $x = shift @array; the variable $x will contain the former first element, and @array will be left with its last two elements. (You can see that shift is just like pop, but acts on the other end of the array.)

In a similarly analogous way, unshift acts on the beginning of the array as push acts on the end: the values given to unshift are inserted before the existing elements, in the order given.
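The four end-of-array functions can be sketched together as follows (variable names and values are illustrative):

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

my $last = pop @array;        # $last is 'c'; @array is ('a', 'b')
push @array, 'd', 'e';        # @array is ('a', 'b', 'd', 'e')

my $first = shift @array;     # $first is 'a'; @array is ('b', 'd', 'e')
unshift @array, 'x', 'y';     # @array is ('x', 'y', 'b', 'd', 'e')

print "@array\n";             # x y b d e
```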

The first element of an array is accessed at index 0: $array[0]. Why the $ sigil? Remember that the elements of an array can only be scalar values. The $ makes sense here because we are accessing a single, scalar element out of the array. The thing inside the square brackets does not have to be an integer literal; it can be any expression which results in a number. (If the resulting number is not an integer, it will be truncated to an integer, that is, rounded toward zero.)

To change the value of the last element, assign to $array[$#array].
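Element access and assignment can be sketched as follows (variable names are illustrative):

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

print $array[0], "\n";          # a
my $truncated = $array[1.9];    # index 1.9 is truncated to 1
print "$truncated\n";           # b

$array[$#array] = 'z';          # change the last element
print "@array\n";               # a b z
```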

By analogy, if you want to access multiple elements at once, you would use the @ sigil instead of the $. In addition, you would provide a list of index values within the square brackets, rather than just one. This syntax for accessing multiple elements of an array at once is called an array slice.

Never forget that with an array slice the index expression is a list: it will be evaluated in list context, and can return any number (including zero) of index numbers. However many numbers are in the list of indices, that's how many elements will be included in the slice.

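Beware, though: an array slice may look like an array, due to the @ sigil, but it is not. For example, using a slice such as @array[1,2,3] in scalar context will not yield the number of items in the slice! It yields the last element of the slice instead.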

To set the second, third, and fourth elements in an array, assign a list to the slice @array[1, 2, 3].
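A sketch of reading and assigning through a slice, including the scalar-context caveat (variable names and values are illustrative):

```perl
use strict;
use warnings;

my @array = (10, 20, 30, 40, 50);

my @middle = @array[1, 2, 3];            # (20, 30, 40)

@array[1, 2, 3] = ('b', 'c', 'd');       # set the 2nd, 3rd, and 4th elements
print "@array\n";                        # 10 b c d 50

my $last_of_slice = @array[1, 2, 3];     # scalar context: last element, not a count
print "$last_of_slice\n";                # d
```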

We said earlier that array indices are non-negative integers. While this is strictly true at some level, perl conveniently lets you index elements from the end of the array using negative indices. $array[-1] refers to the last element, $array[-2] to the next-to-last element, and so on. To oversimplify a bit, -1 acts like an alias for $#array, but only when it is used as an array index!

So $array[-1] and $array[$#array] are equivalent. But beware: @array[0 .. $#array] cannot be written as @array[0 .. -1], because in this situation the -1 is an argument of the range operator, which has no idea what "highest index number" is actually wanted.
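The negative-index behaviour, and the range pitfall, can be sketched as follows (variable names are illustrative):

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c', 'd');

print $array[-1], "\n";              # d (same as $array[$#array])
print $array[-2], "\n";              # c

my @all  = @array[0 .. $#array];     # every element
my @none = @array[0 .. -1];          # 0 .. -1 is an empty range: no elements
print scalar(@all), " ", scalar(@none), "\n";   # 4 0
```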

Insert/Delete/Replace items in the middle of an array

It is possible to insert items into the middle of an array and remove items from the middle of an array. The function which enables this is called splice. It can insert items anywhere in an array (including the ends), and it can remove (and return) any sub-sequence of items from an array. In fact, it can do both of these at once: remove some sub-sequence of items and put another list of values in their place. splice always returns the list of removed values, if any.

The second argument of splice is an array index, and as such, everything we've said about indices applies to it.

The queue-like array functions could have been implemented in terms of splice, as follows:

    unshift @a, @b;
    # could be written as
    splice @a, 0, 0, @b;

    push @a, @b;
    # could be written as
    splice @a, $#a+1, 0, @b;
    # we have to index to a position PAST the end of array!

    $b = shift @a;
    # could be written as
    $b = splice @a, 0, 1;

    $b = pop @a;
    # could be written as
    $b = splice @a, -1, 1;

(Beware that in scalar context splice returns the last of the list of values removed; shift and pop always return the one value removed.)

Remove 3 items, beginning with the 3rd:

    @b = splice @a, 2, 3;

Insert some new values before the 3rd item (that is, after the 2nd), without deleting any:

    splice @a, 2, 0, @b;

Replace the 4th and 5th items with three other values:

    splice @a,           # array to modify
        3,               # starting with 4th item
        2,               # remove (replace) two items
        'x', 'y', 'z';   # arbitrary list of new values to insert

And while we're at it: Clear an array - Round 3:

    @a = ();
    # could be written as
    splice @a, 0;
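To see the splice calls above in action, here is a runnable sketch that applies them in sequence to a sample array; the starting values 'a' through 'h' are an assumption chosen for illustration.

```perl
use strict;
use warnings;

my @a = ('a' .. 'h');

my @b = splice @a, 2, 3;          # remove 3 items starting at index 2
my $after_remove = "@a | @b";     # "a b f g h | c d e"

splice @a, 2, 0, @b;              # insert them back at index 2, removing nothing
my $after_insert = "@a";          # "a b c d e f g h"

splice @a, 3, 2, 'x', 'y', 'z';   # replace the 4th and 5th items with three values
my $after_replace = "@a";         # "a b c x y z f g h"

splice @a, 0;                     # remove everything from index 0 onward
print scalar(@a), "\n";           # 0
```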

Any Questions?

The Perl FAQ has a section on Arrays.

Related Resources

  • Shift, Pop, Unshift and Push with Impunity!
  • Multidimensional Arrays

What about wantarray?

Despite its name, wantarray has nothing to do with arrays. It is misnamed. It should have been named something like detect_context. It is used inside subroutines to detect whether the sub is being called in list, scalar, or void context. It returns true, false, and undef in those cases, respectively.
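A minimal sketch of context detection with wantarray. The subroutine name context is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# context() is a hypothetical helper that reports how it was called.
sub context {
    my $want = wantarray;
    return !defined $want ? 'void'
         : $want          ? 'list'
         :                  'scalar';
}

my @list   = context();       # called in list context
my $scalar = context();       # called in scalar context
print "$list[0] $scalar\n";   # list scalar
```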

Other possible topics:

  • tying arrays; the Tie::Array module
  • delete and how it doesn't work on arrays
  • exists and how it DOES work on arrays
  • Various related Perl FAQ entries
  • Array-related modules, such as those in the Array:: family
  • Traps/gotchas, such as deleting from an array while iterating over it
  • multidimensional arrays

If you have corrections or suggestions for changes to this tutorial, please /msg me if possible, rather than posting a reply. Thanks.

Perl hash basics: create, update, loop, delete and sort

Jun 16, 2013 by David Farrell

Hashes are one of Perl’s core data types. This article describes the main functions and syntax rules for working with hashes in Perl.

Declaration and initialization

A hash is an unsorted collection of key value pairs. Within a hash a key is a unique string that references a particular value. A hash can be modified once initialized. Because a hash is unsorted, if its contents are required in a particular order then they must be sorted on output. Perl uses the ‘%’ symbol as the variable sigil for hashes. A declaration such as my %weekly_temperature; declares an empty hash.

Similar to the syntax for arrays, hashes can also be declared using a list of comma separated values:

In the code above, Perl takes the first entry in the list as a key (‘monday’), and the second entry as that key’s value (65). The third entry in the list (‘tuesday’) would then be declared as a key, and the fourth entry (68) as its value and so on.

The ‘fat comma’ operator looks like an arrow (‘=>’) and allows the declaration of key value pairs instead of using a comma. This makes for cleaner and more readable code. Additionally there is no need to quote strings for keys when using the fat comma. Using the fat comma, the same declaration of %weekly_temperature would look like this:
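The three declaration styles described above can be sketched as follows. The hash name %weekly_temperature and the monday and tuesday values come from the text; the names %empty and %same are illustrative.

```perl
use strict;
use warnings;

my %empty;                                   # an empty hash

# Plain comma-separated list: key, value, key, value, ...
my %weekly_temperature = ('monday', 65, 'tuesday', 68);

# The same data written with the fat comma; keys need no quotes.
my %same = (
    monday  => 65,
    tuesday => 68,
);

print "$weekly_temperature{monday} $same{tuesday}\n";   # 65 68
```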

Access a value

To access the value of a key value pair, Perl requires the key encased in curly brackets.

Note that strings do not need to be quoted when placed between the curly brackets for hash keys and that the scalar sigil (‘$’) is used when accessing a single scalar value instead of (‘%’).
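A short sketch of looking up a single value (the variable $monday_temp is illustrative):

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

# Scalar sigil, key in curly brackets; the bareword key needs no quotes.
my $monday_temp = $weekly_temperature{monday};
print "$monday_temp\n";   # 65
```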

Take a slice of a hash

A slice is a list of values. In Perl a slice can be read into an array, assigned to individual scalars, or used as an input to a function or subroutine. Slices are useful when you only want to extract a handful of values from a hash. For example:

The code above declares the ‘weekly_temperature’ hash as usual. What’s unusual here is that to get the slice of values, the array sigil (‘@’) is used by prepending it to the hash variable name. With this change the hash will then look up a list of values.
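A hash slice can be sketched as follows. The wednesday entry and its value are assumptions added for illustration.

```perl
use strict;
use warnings;

my %weekly_temperature = (
    monday    => 65,
    tuesday   => 68,
    wednesday => 59,   # hypothetical value
);

# The '@' sigil with curly brackets returns a list of values.
my ($monday_temp, $wednesday_temp) = @weekly_temperature{'monday', 'wednesday'};
my @first_two = @weekly_temperature{qw(monday tuesday)};

print "$monday_temp $wednesday_temp\n";   # 65 59
```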

Access all values with the values function

The values function returns a list of the values contained in a hash. It’s possible to loop through the list of returned values and perform operations on them (e.g. print). For example:

A couple more tips when working with key value pairs of a hash: the code is more readable if you vertically align the fat comma (‘=>’) operators and unlike C, Perl allows the last element to have a trailing comma, which makes it easier to add elements later without generating a compile error.
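Looping over the values might look like this sketch, which also shows aligned fat commas and a trailing comma. The wednesday entry is a hypothetical value.

```perl
use strict;
use warnings;

my %weekly_temperature = (
    monday    => 65,
    tuesday   => 68,
    wednesday => 59,   # hypothetical value; the trailing comma is allowed
);

my $total = 0;
foreach my $temperature (values %weekly_temperature) {
    print "$temperature\n";      # printed in no particular order
    $total += $temperature;
}
print "total: $total\n";         # total: 192
```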

Access all keys with the keys function

The keys function returns a list of the keys contained in a hash. A common way to access all the key value pairs of a hash is to loop through the list returned by the keys function. For example:

In the above code we used the keys function to return a list of keys, looped through the list with foreach, and then printed the key and the value of each pair. Note that the print order of the pairs is different from initialization; that’s because hashes store their pairs in a random internal order. Also we used an interpolated string in double quotes ("). This allows us to mix variables with plain text and escape characters like newline ('\n') for convenient printing.
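A sketch of the loop that the paragraph above describes. The message wording and the @days_seen bookkeeping are illustrative.

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

my @days_seen;
foreach my $day (keys %weekly_temperature) {
    # Double quotes interpolate both the key and the looked-up value.
    print "The temperature on $day was $weekly_temperature{$day}\n";
    push @days_seen, $day;
}
```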

Access all key value pairs with the each function

The each function returns all keys and values of a hash, one at a time:
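A minimal sketch of iterating with each (the counter $pairs_seen is illustrative):

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

my $pairs_seen = 0;
# each returns one (key, value) pair per call, then an empty list when done.
while (my ($day, $temperature) = each %weekly_temperature) {
    print "$day: $temperature\n";
    $pairs_seen++;
}
```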

Add a new key value pair

To add a new pair to a hash, use this syntax:
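For instance, assuming a new wednesday entry with a hypothetical value:

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

$weekly_temperature{wednesday} = 59;          # key did not exist: pair is added
print scalar(keys %weekly_temperature), "\n"; # 3
```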

Delete a key value pair

To remove a key value pair from a hash use the delete function. Delete requires the key of the pair in order to delete it from the hash:
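A short sketch of delete; note that it also returns the value it removed.

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

my $removed = delete $weekly_temperature{tuesday};   # returns the deleted value
print "$removed\n";                                  # 68
print exists $weekly_temperature{tuesday} ? "present\n" : "gone\n";   # gone
```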

Update a value of a pair

To update the value of a pair, simply assign it a new value using the same syntax as to add a new key value pair. The difference here is that the key already exists in the hash:
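For example (the new value 70 is illustrative):

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

$weekly_temperature{monday} = 70;             # key exists: value is replaced
print "$weekly_temperature{monday}\n";        # 70
```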

Empty a hash

To empty a hash, re-declare it with no members:
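In practice this means assigning an empty list to the hash, as in this sketch:

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

%weekly_temperature = ();                       # assign an empty list
print scalar(keys %weekly_temperature), "\n";   # 0
```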

increment / decrement a value

Quick answer: use the same syntax for assigning / updating a value with the increment or decrement operator:
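A minimal sketch of both operators applied to hash values:

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

$weekly_temperature{monday}++;     # 65 becomes 66
$weekly_temperature{tuesday}--;    # 68 becomes 67

print "$weekly_temperature{monday} $weekly_temperature{tuesday}\n";   # 66 67
```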

Sort a hash alphabetically

Although the internal ordering of a hash is random, it is possible to sort the output from a hash into a more useful sequence. Perl provides the sort function to (obviously) sort lists of data. By default it sorts alphabetically:

Let’s review the code above. The compare block receives the hash keys using the keys function. It then compares the values of each key using $a and $b as special variables to lookup and compare the values of the two keys. This sorted list of keys is then passed to the foreach command and looped through as usual. Note how the order is printed in value order - however it is still alphabetical ordering.
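The two orderings discussed here can be sketched as follows: a default sort of the keys, and a compare block that orders the keys by their values using cmp. The wednesday value is an assumption chosen so that the two orders differ.

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68, wednesday => 59);

# Default sort: keys in alphabetical order.
my @by_key = sort keys %weekly_temperature;

# Compare block: keys ordered by their values, compared as strings.
my @by_value = sort { $weekly_temperature{$a} cmp $weekly_temperature{$b} }
               keys %weekly_temperature;

foreach my $day (@by_value) {
    print "$day: $weekly_temperature{$day}\n";
}
```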

Sort a hash numerically

Numerically sorting a hash requires using a compare block as in the previous example, but replacing the ‘cmp’ operator with the numerical comparison operator (‘<=>’):
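A sketch contrasting the two operators. The thursday value of 100 is a hypothetical reading, chosen because string and numeric comparison disagree about it.

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68, thursday => 100);

# String comparison puts "100" before "65".
my @as_strings = sort { $weekly_temperature{$a} cmp $weekly_temperature{$b} }
                 keys %weekly_temperature;

# Numeric comparison puts 100 last.
my @as_numbers = sort { $weekly_temperature{$a} <=> $weekly_temperature{$b} }
                 keys %weekly_temperature;

print "@as_strings\n";   # thursday monday tuesday
print "@as_numbers\n";   # monday tuesday thursday
```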

Get the hash size

To get the size of a hash, simply call the keys function in a scalar context. This can be done by assigning the return value of keys to a scalar variable:
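For example (the variable $size is illustrative):

```perl
use strict;
use warnings;

my %weekly_temperature = (monday => 65, tuesday => 68);

my $size = keys %weekly_temperature;   # scalar assignment gives the key count
print "$size\n";                       # 2
```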

This article was originally posted on PerlTricks.com .


This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License .


Perl Hash

Summary : in this tutorial, you’ll learn about another compound data type called Perl hash and how to manipulate hash elements effectively.

Introduction to Perl hash

A Perl hash is defined by key-value pairs. Perl stores elements of a hash in such an optimal way that you can look up its values based on keys very fast.

With an array, you use indices to access its elements. With a hash, however, you use descriptive keys to access its elements. A hash is sometimes referred to as an associative array.

Like a scalar or an array variable, a hash variable has its own prefix. A hash variable must begin with a percent sign (%). The prefix % looks like a key/value pair, so remember this trick when naming hash variables.

The following example defines a simple hash.

To make the code easier to read, Perl provides the => operator as an alternative to a comma (,). It helps differentiate between keys and values and makes the code more elegant.

When you see the => operator, you are usually looking at key/value pairs, most often in a hash, although => is simply a comma that also quotes the word on its left and may appear in any list.

The %countries hash can be rewritten using the => operator as follows:
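Both forms can be sketched side by side. The hash name %countries comes from the text; its keys and values, and the name %countries_fat, are assumptions made for illustration.

```perl
use strict;
use warnings;

# Comma form: key, value, key, value, ...
my %countries = ('England', 'English', 'France', 'French', 'Spain', 'Spanish');

# Fat comma form: the same data, with the pairs made visible.
my %countries_fat = (
    England => 'English',
    France  => 'French',
    Spain   => 'Spanish',
);

print "$countries{France} $countries_fat{France}\n";   # French French
```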

Perl requires the keys of a hash to be strings, meanwhile, the values can be any scalars. If you use non-string values as the keys, you may get an unexpected result.

In addition, a hash key must be unique. If you try to add a new key-value pair with the key that already exists, the value of the existing key will be over-written.

Notice that you can omit the quotation marks around the keys of the hash.
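The string-key and uniqueness rules can be sketched as follows; the hash %h and its contents are hypothetical.

```perl
use strict;
use warnings;

my %h;

$h{1.0}   = 'number';    # the number 1.0 is stringified to the key "1"
$h{'1.0'} = 'string';    # a different key: the string "1.0"

$h{apple} = 'first';
$h{apple} = 'second';    # same key: the earlier value is overwritten

print join(',', sort keys %h), "\n";   # 1,1.0,apple
print "$h{apple}\n";                   # second
```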

Perl hash operations

In the following sections, we will examine the most commonly used operations on a hash.

Look up Perl hash values

You use a hash key inside curly brackets {} to look up a hash value. Take a look at the following example:

Add a new element

To add a new element to a hash, you assign a value to a new key as follows:

Remove a single key/value pair

If you know the hash key, you can remove single key-value pair from the hash by using delete() function as follows:

Modify hash elements

You can modify the value of the existing key/value pair by assigning a new value as shown in the following example:
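The lookup, add, remove, and modify operations described in this section can be sketched together. The hash name %langs appears later in the text; its contents here are assumptions made for illustration.

```perl
use strict;
use warnings;

my %langs = (
    England => 'English',
    France  => 'French',
    Spain   => 'Spanish',
);

my $found = $langs{England};        # look up a value by key
$langs{Italy} = 'Italian';          # add a new element
delete $langs{Spain};               # remove a key/value pair
$langs{France} = 'French (fr)';     # modify an existing value

print join(',', sort keys %langs), "\n";   # England,France,Italy
```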

Loop over Perl hash elements

Perl provides the keys() function, which returns the keys of a hash as a list of scalars. You can use the keys() function in a for loop statement to iterate over the hash elements:

The keys() function returns a list of hash keys. The for loop visits each key and assigns it to a special variable $_ . Inside the loop, we access the value of a hash element via its key as $langs{$_} .
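A sketch of the loop described above, using the $langs{$_} access from the text. The hash contents are hypothetical, and the keys are sorted here only to make the output order predictable.

```perl
use strict;
use warnings;

my %langs = (England => 'English', France => 'French');

my @lines;
for (sort keys %langs) {
    # $_ holds the current key; $langs{$_} is its value.
    push @lines, "Official language of $_ is $langs{$_}";
}
print "$_\n" for @lines;
```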

In this tutorial, you’ve learned about Perl hash and some techniques to manipulate hash elements.

perldata - Perl data types

# DESCRIPTION

# Variable names

Perl has three built-in data types: scalars, arrays of scalars, and associative arrays of scalars, known as "hashes". A scalar is a single string (of any size, limited only by the available memory), number, or a reference to something (which will be discussed in perlref ). Normal arrays are ordered lists of scalars indexed by number, starting with 0. Hashes are unordered collections of scalar values indexed by their associated string key.

Values are usually referred to by name, or through a named reference. The first character of the name tells you to what sort of data structure it refers. The rest of the name tells you the particular value to which it refers. Usually this name is a single identifier , that is, a string beginning with a letter or underscore, and containing letters, underscores, and digits. In some cases, it may be a chain of identifiers, separated by :: (or by the deprecated ' ); all but the last are interpreted as names of packages, to locate the namespace in which to look up the final identifier (see "Packages" in perlmod for details). For a more in-depth discussion on identifiers, see "Identifier parsing" . It's possible to substitute for a simple identifier, an expression that produces a reference to the value at runtime. This is described in more detail below and in perlref .

Perl also has its own built-in variables whose names don't follow these rules. They have strange names so they don't accidentally collide with one of your normal variables. Strings that match parenthesized parts of a regular expression are saved under names containing only digits after the $ (see perlop and perlre ). In addition, several special variables that provide windows into the inner working of Perl have names containing punctuation characters. These are documented in perlvar .

Scalar values are always named with '$', even when referring to a scalar that is part of an array or a hash. The '$' symbol works semantically like the English word "the" in that it indicates a single value is expected.

Entire arrays (and slices of arrays and hashes) are denoted by '@', which works much as the word "these" or "those" does in English, in that it indicates multiple values are expected.

Entire hashes are denoted by '%':
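The sigil rules above can be sketched with one name used for a scalar, an array, and a hash. The name days and the sample values are illustrative.

```perl
use strict;
use warnings;

my $days = 7;                           # a scalar
my @days = ('Mon', 'Tue', 'Wed');       # an entire array
my %days = (Mon => 1, Tue => 2);        # an entire hash

my $element = $days[1];                 # '$': one element of @days
my $value   = $days{Tue};               # '$': one value from %days
my @pair    = @days[0, 1];              # '@': a slice of @days
my @numbers = @days{'Mon', 'Tue'};      # '@': a slice of %days

print "$days $element $value @pair @numbers\n";   # 7 Tue 2 Mon Tue 1 2
```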

In addition, subroutines are named with an initial '&', though this is optional when unambiguous, just as the word "do" is often redundant in English. Symbol table entries can be named with an initial '*', but you don't really care about that yet (if ever :-).

Every variable type has its own namespace, as do several non-variable identifiers. This means that you can, without fear of conflict, use the same name for a scalar variable, an array, or a hash--or, for that matter, for a filehandle, a directory handle, a subroutine name, a format name, or a label. This means that $foo and @foo are two different variables. It also means that $foo[1] is a part of @foo, not a part of $foo. This may seem a bit weird, but that's okay, because it is weird.

Because variable references always start with '$', '@', or '%', the "reserved" words aren't in fact reserved with respect to variable names. They are reserved with respect to labels and filehandles, however, which don't have an initial special character. You can't have a filehandle named "log", for instance. Hint: you could say open(LOG,'logfile') rather than open(log,'logfile') . Using uppercase filehandles also improves readability and protects you from conflict with future reserved words. Case is significant--"FOO", "Foo", and "foo" are all different names. Names that start with a letter or underscore may also contain digits and underscores.

It is possible to replace such an alphanumeric name with an expression that returns a reference to the appropriate type. For a description of this, see perlref .

Names that start with a digit may contain only more digits. Names that do not start with a letter, underscore, digit or a caret are limited to one character, e.g., $% or $$ . (Most of these one character names have a predefined significance to Perl. For instance, $$ is the current process id. And all such names are reserved for Perl's possible use.)

# Identifier parsing

Up until Perl 5.18, the actual rules of what a valid identifier was were a bit fuzzy. However, in general, anything defined here should work on previous versions of Perl, while the opposite -- edge cases that work in previous versions, but aren't defined here -- probably won't work on newer versions. As an important side note, please note that the following only applies to bareword identifiers as found in Perl source code, not identifiers introduced through symbolic references, which have much fewer restrictions. If working under the effect of the use utf8; pragma, the following rules apply:

That is, a "start" character followed by any number of "continue" characters. Perl requires every character in an identifier to also match \w (this prevents some problematic cases); and Perl additionally accepts identifier names beginning with an underscore.

If not under use utf8 , the source is treated as ASCII + 128 extra generic characters, and identifiers should match

That is, any word character in the ASCII range, as long as the first character is not a digit.
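The ASCII rule just stated can be approximated with a small check. This is a sketch under the assumption that the rule is exactly "ASCII word characters, not starting with a digit"; the helper name is_basic_identifier is hypothetical and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name is made only of ASCII word characters
# and does not begin with a digit. The /a flag restricts \w to ASCII.
sub is_basic_identifier {
    my ($name) = @_;
    return $name =~ /\A (?!\d) \w+ \z/xa ? 1 : 0;
}

print is_basic_identifier('foo'),    "\n";   # 1
print is_basic_identifier('_bar9'),  "\n";   # 1
print is_basic_identifier('9lives'), "\n";   # 0
```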

There are two package separators in Perl: A double colon ( :: ) and a single quote ( ' ). Use of ' as the package separator is deprecated and will be removed in Perl 5.40. Normal identifiers can start or end with a double colon, and can contain several parts delimited by double colons. Single quotes have similar rules, but with the exception that they are not legal at the end of an identifier: That is, $'foo and $foo'bar are legal, but $foo'bar' is not.

Additionally, if the identifier is preceded by a sigil -- that is, if the identifier is part of a variable name -- it may optionally be enclosed in braces.

While you can mix double colons with singles quotes, the quotes must come after the colons: $::::'foo and $foo::'bar are legal, but $::'::foo and $foo'::bar are not.

Put together, a grammar to match a basic identifier becomes

Meanwhile, special identifiers don't follow the above rules; For the most part, all of the identifiers in this category have a special meaning given by Perl. Because they have special parsing rules, these generally can't be fully-qualified. They come in six forms (but don't use forms 5 and 6):

1. A sigil, followed solely by digits matching \p{POSIX_Digit} , like $0 , $1 , or $10000 .

2. A sigil followed by a single character matching the \p{POSIX_Punct} property, like $! or %+ , except the character "{" doesn't work.

3. A sigil, followed by a caret and any one of the characters [][A-Z^_?\] , like $^V or $^] .

4. Similar to the above, a sigil, followed by bareword text in braces, where the first character is a caret. The next character is any one of the characters [][A-Z^_?\] , followed by ASCII word characters. An example is ${^GLOBAL_PHASE} .

5. A sigil, followed by any single character in the range [\xA1-\xAC\xAE-\xFF] when not under "use utf8" . (Under "use utf8" , the normal identifier rules given earlier in this section apply.) Use of non-graphic characters (the C1 controls, the NO-BREAK SPACE, and the SOFT HYPHEN) has been disallowed since v5.26.0. The use of the other characters is unwise, as these are all reserved to have special meaning to Perl, and none of them currently do have special meaning, though this could change without notice.

   Note that an implication of this form is that there are identifiers only legal under "use utf8" , and vice-versa, for example the identifier $état is legal under "use utf8" , but is otherwise considered to be the single character variable $é followed by the bareword "tat" , the combination of which is a syntax error.

6. This is a combination of the previous two forms. It is valid only when not under "use utf8" (normal identifier rules apply when under "use utf8" ). The form is a sigil, followed by text in braces, where the first character is any one of the characters in the range [\x80-\xFF] followed by ASCII word characters up to the trailing brace.

   The same caveats as the previous form apply: The non-graphic characters are no longer allowed with "use utf8" , it is unwise to use this form at all, and utf8ness makes a big difference.

Prior to Perl v5.24, non-graphical ASCII control characters were also allowed in some situations; this had been deprecated since v5.20.

# Context

The interpretation of operations and values in Perl sometimes depends on the requirements of the context around the operation or value. There are two major contexts: list and scalar. Certain operations return list values in contexts wanting a list, and scalar values otherwise. If this is true of an operation it will be mentioned in the documentation for that operation. In other words, Perl overloads certain operations based on whether the expected return value is singular or plural. Some words in English work this way, like "fish" and "sheep".

In a reciprocal fashion, an operation provides either a scalar or a list context to each of its arguments. For example, if you say int( <STDIN> ) then the integer operation provides scalar context for the <> operator, which responds by reading one line from STDIN and passing it back to the integer operation, which will then find the integer value of that line and return that. If, on the other hand, you say sort( <STDIN> ) then the sort operation provides list context for <>, which will proceed to read every line available up to the end of file, and pass that list of lines back to the sort routine, which will then sort those lines and return them as a list to whatever the context of the sort was.
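The same contrast can be tried without typing at a terminal. This sketch swaps STDIN for an in-memory filehandle; the sample input is an assumption made for illustration.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for STDIN in this sketch.
my $input = "3.7\n2\n10\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my $first  = int(<$fh>);     # scalar context: reads one line, "3.7\n", giving 3
my @sorted = sort(<$fh>);    # list context: reads all remaining lines
chomp @sorted;

print "$first | @sorted\n";  # 3 | 10 2  (sort compares as strings by default)
```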

Assignment is a little bit special in that it uses its left argument to determine the context for the right argument. Assignment to a scalar evaluates the right-hand side in scalar context, while assignment to an array or hash evaluates the righthand side in list context. Assignment to a list (or slice, which is just a list anyway) also evaluates the right-hand side in list context.

When you use the use warnings pragma or Perl's -w command-line option, you may see warnings about useless uses of constants or functions in "void context". Void context just means the value has been discarded, such as a statement containing only "fred"; or getpwuid(0); . It still counts as scalar context for functions that care whether or not they're being called in list context.

User-defined subroutines may choose to care whether they are being called in a void, scalar, or list context. Most subroutines do not need to bother, though. That's because both scalars and lists are automatically interpolated into lists. See "wantarray" in perlfunc for how you would dynamically discern your function's calling context.
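As a minimal sketch of how assignment sets the context and how a subroutine can observe it, the following uses a hypothetical helper named context_of_call (not part of Perl) built on wantarray:

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context it was called in.
sub context_of_call {
    my $want = wantarray;    # true in list, false in scalar, undef in void context
    return defined $want ? ( $want ? 'list' : 'scalar' ) : 'void';
}

my @in_list   = context_of_call();    # assignment to an array: list context
my $in_scalar = context_of_call();    # assignment to a scalar: scalar context
my ($first)   = context_of_call();    # assignment to a list: list context

print "$in_list[0] $in_scalar $first\n";    # list scalar list
```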

# Scalar values

All data in Perl is a scalar, an array of scalars, or a hash of scalars. A scalar may contain one single value in any of three different flavors: a number, a string, or a reference. In general, conversion from one form to another is transparent. Although a scalar may not directly hold multiple values, it may contain a reference to an array or hash which in turn contains multiple values.

Scalars aren't necessarily one thing or another. There's no place to declare a scalar variable to be of type "string", type "number", type "reference", or anything else. Because of the automatic conversion of scalars, operations that return scalars don't need to care (and in fact, cannot care) whether their caller is looking for a string, a number, or a reference. Perl is a contextually polymorphic language whose scalars can be strings, numbers, or references (which includes objects). Although strings and numbers are considered pretty much the same thing for nearly all purposes, references are strongly-typed, uncastable pointers with builtin reference-counting and destructor invocation.

A scalar value is interpreted as FALSE in the Boolean sense if it is undefined, the null string or the number 0 (or its string equivalent, "0"), and TRUE if it is anything else. The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed. Negation of a true value by ! or not returns a special false value. When evaluated as a string it is treated as "" , but as a number, it is treated as 0. Most Perl operators that return true or false behave this way.
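The truth rules above can be checked directly. This is only an illustrative sketch; the variable names are made up for the example:

```perl
use strict;
use warnings;

my @false_values = ( undef, "", 0, "0" );           # the only false scalars
my @true_values  = ( "0.0", "00", " ", "a", -1 );   # everything else is true

my @wrongly_true  = grep {  $_ } @false_values;     # stays empty
my @wrongly_false = grep { !$_ } @true_values;      # stays empty

my $negated = !1;    # the special false value: "" as a string, 0 as a number
```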

There are actually two varieties of null strings (sometimes referred to as "empty" strings), a defined one and an undefined one. The defined version is just a string of length zero, such as "" . The undefined version is the value that indicates that there is no real value for something, such as when there was an error, or at end of file, or when you refer to an uninitialized variable or element of an array or hash. Although in early versions of Perl, an undefined scalar could become defined when first used in a place expecting a defined value, this no longer happens except for rare cases of autovivification as explained in perlref . You can use the defined() operator to determine whether a scalar value is defined (this has no meaning on arrays or hashes), and the undef() operator to produce an undefined value.

To find out whether a given string is a valid non-zero number, it's sometimes enough to test it against both numeric 0 and also lexical "0" (although this will cause noises if warnings are on). That's because strings that aren't numbers count as 0, just as they do in awk :
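A minimal sketch of that test, wrapped in a hypothetical helper (looks_nonnumeric is an illustrative name, and the numeric warning is silenced locally so the check stays quiet):

```perl
use strict;
use warnings;

# Hypothetical helper: true when $str is numerically 0 but is not the string "0".
sub looks_nonnumeric {
    my ($str) = @_;
    no warnings 'numeric';    # non-numeric strings would otherwise make noise
    return $str == 0 && $str ne "0";
}

for my $candidate ( "abc", "42", "0" ) {
    warn "'$candidate' doesn't look like a number\n" if looks_nonnumeric($candidate);
}
```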

That method may be best because otherwise you won't treat IEEE notations like NaN or Infinity properly. At other times, you might prefer to determine whether string data can be used numerically by calling the POSIX::strtod() function or by inspecting your string with a regular expression (as documented in perlre ).

The length of an array is a scalar value. You may find the length of array @days by evaluating $#days , as in csh . However, this isn't the length of the array; it's the subscript of the last element, which is a different value since there is ordinarily a 0th element. Assigning to $#days actually changes the length of the array. Shortening an array this way destroys intervening values. Lengthening an array that was previously shortened does not recover values that were in those elements.

You can also gain some minuscule measure of efficiency by pre-extending an array that is going to get big. You can also extend an array by assigning to an element that is off the end of the array. You can truncate an array down to nothing by assigning the null list () to it. The following are equivalent: @whatever = (); and $#whatever = -1;

If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.) The following is always true: scalar(@whatever) == $#whatever + 1;

Some programmers choose to use an explicit conversion so as to leave nothing to doubt: $element_count = scalar(@whatever);
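The relationship between $#array, the array length, and truncation described above can be sketched as follows (the array contents are invented for illustration):

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);

my $last_index = $#days;           # subscript of the last element: 4
my $length     = @days;            # array in scalar context: 5
my $explicit   = scalar(@days);    # same value, stated explicitly

$#days = 2;                        # shorten: Thu and Fri are destroyed
$#days = 4;                        # lengthen again: the new slots are undef
my $recovered = defined $days[3] ? 1 : 0;    # 0, old values do not come back

my @first  = ( 1, 2, 3 );
my @second = ( 1, 2, 3 );
@first   = ();                     # truncate by assigning the null list
$#second = -1;                     # equivalent truncation
```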

If you evaluate a hash in scalar context, it returns a false value if the hash is empty. If there are any key/value pairs, it returns a true value. A more precise definition is version dependent.

Prior to Perl 5.25 the value returned was a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16" , which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen.

As of Perl 5.25 the return was changed to be the count of keys in the hash. If you need access to the old behavior you can use Hash::Util::bucket_ratio() instead.

If a tied hash is evaluated in scalar context, the SCALAR method is called (with a fallback to FIRSTKEY ).

You can preallocate space for a hash by assigning to the keys() function. This rounds up the allocated buckets to the next power of two: keys(%users) = 1000; allocates 1024 buckets.

# Scalar value constructors

Numeric literals are specified in any of the following floating point or integer formats:
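A few representative literal forms, collected into an array purely for illustration:

```perl
use strict;
use warnings;

my @literals = (
    12345,            # integer
    12345.67,         # floating point
    .23E-10,          # a very small number
    4_294_967_296,    # underscores for legibility
    0xff,             # hexadecimal
    0377,             # octal (a leading 0)
    0b011011,         # binary
);

print "$_\n" for @literals;
```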

You are allowed to use underscores (underbars) in numeric literals between digits for legibility (but not multiple underscores in a row: 23__500 is not legal; 23_500 is). You could, for example, group binary digits by threes (as for a Unix-style mode argument such as 0b110_100_100) or by fours (to represent nibbles, as in 0b1010_0110) or in other groups.

String literals are usually delimited by either single or double quotes. They work much like quotes in the standard Unix shells: double-quoted string literals are subject to backslash and variable substitution; single-quoted strings are not (except for \' and \\ ). The usual C-style backslash rules apply for making characters such as newline, tab, etc., as well as some more exotic forms. See "Quote and Quote-like Operators" in perlop for a list.

Hexadecimal, octal, or binary representations in string literals (e.g. '0xff') are not automatically converted to their integer representation. The hex() and oct() functions make these conversions for you. See "hex" in perlfunc and "oct" in perlfunc for more details.
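A short sketch of the difference between plain numeric conversion and hex()/oct(); the variable names are illustrative only:

```perl
use strict;
use warnings;

my $from_hex = hex('0xff');      # 255
my $from_oct = oct('0377');      # 255
my $from_bin = oct('0b1010');    # 10; oct() also understands 0b and 0x prefixes

# Ordinary string-to-number conversion stops at the "x" and yields 0.
my $unconverted = do { no warnings 'numeric'; '0xff' + 0 };
```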

Hexadecimal floating point can start just like a hexadecimal literal, and it can be followed by an optional fractional hexadecimal part, but it must be followed by p , an optional sign, and a power of two. The format is useful for accurately presenting floating point values, avoiding conversions to or from decimal floating point, and therefore avoiding possible loss in precision. Notice that while most current platforms use the 64-bit IEEE 754 floating point, not all do. Another potential source of (low-order) differences are the floating point rounding modes, which can differ between CPUs, operating systems, and compilers, and which Perl doesn't control.

You can also embed newlines directly in your strings, i.e., they can end on a different line than they begin. This is nice, but if you forget your trailing quote, the error will not be reported until Perl finds another line containing the quote character, which may be much further on in the script. Variable substitution inside strings is limited to scalar variables, arrays, and array or hash slices. (In other words, names beginning with $ or @, followed by an optional bracketed expression as a subscript.) The following code segment prints out "The price is $100."
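A minimal sketch of such a segment (the variable names are chosen for illustration):

```perl
use strict;
use warnings;

my $Price   = '$100';                        # single quotes: not interpolated
my $message = "The price is $Price.\n";      # double quotes: $Price is interpolated once
print $message;                              # The price is $100.
```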

There is no double interpolation in Perl, so the $100 is left as is.

By default floating point numbers substituted inside strings use the dot (".") as the decimal separator. If use locale is in effect, and POSIX::setlocale() has been called, the character used for the decimal separator is affected by the LC_NUMERIC locale. See perllocale and POSIX .

# Demarcated variable names using braces

As in some shells, you can enclose the variable name in braces as a demarcator to disambiguate it from following alphanumerics and underscores or other text. You must also do this when interpolating a variable into a string to separate the variable name from a following double-colon or an apostrophe since these would be otherwise treated as a package separator:
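For instance, a sketch with a scalar named $who, building strings rather than printing to a filehandle:

```perl
use strict;
use warnings;

my $who = "Larry";

# Braces keep the double colon and the following letters out of the variable name.
my $passwd_line = "${who}::0:0:Superuser:/:/bin/perl\n";
my $sentence    = "We use ${who}speak when ${who}'s here.\n";

print $passwd_line, $sentence;
```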

Without the braces, Perl would have looked for a $whospeak, a $who::0 , and a $who's variable. The last two would be the $0 and the $s variables in the (presumably) non-existent package who .

In fact, a simple identifier within such curly braces is forced to be a string, and likewise within a hash subscript. Neither need quoting. Our earlier example, $days{'Feb'} can be written as $days{Feb} and the quotes will be assumed automatically. But anything more complicated in the subscript will be interpreted as an expression. This means for example that $version{2.0}++ is equivalent to $version{2}++ , not to $version{'2.0'}++ .

There is a similar problem with interpolation with text that looks like array or hash access notation. Placing a simple variable like $who immediately in front of text like "[1]" or "{foo}" would cause the variable to be interpolated as accessing an element of @who or a value stored in %who. With $who set to "Larry Wall", the string "$who[1] is the father of Perl.\n" would attempt to access index 1 of an array named @who. Again, using braces will prevent this from happening: "${who}[1] is the father of Perl.\n" will be treated the same as $who . "[1] is the father of Perl.\n".

This notation also applies to more complex variable descriptions, such as array or hash access with subscripts. For instance, given @name = qw(Larry Curly Moe), consider the string "Also ${name[0]}[1] was a member\n". Without the braces it would be parsed as a two-level array subscript in the @name array, and under use strict would likely produce a fatal exception, as it would be parsed like "Also " . $name[0][1] . " was a member\n" and not as the intended "Also " . $name[0] . "[1] was a member\n".

A similar result may be derived by using a backslash on the first character of the subscript or package notation that is not part of the variable you want to access. Thus the above example could also be written "Also $name[0]\[1] was a member\n". However, for some special variables (multi-character caret variables) the demarcated form using curly braces is the only way you can reference the variable at all, and the only way you can access a subscript of the variable via interpolation.
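The braced and backslashed forms can be compared side by side. This is a sketch using lexical variables under use strict; the names are illustrative:

```perl
use strict;
use warnings;

my $who  = "Larry Wall";
my @name = qw(Larry Curly Moe);

# Scalar followed by literal "[1]": braces, a backslash, or concatenation.
my $braced  = "${who}[1] is the father of Perl.\n";
my $escaped = "$who\[1] is the father of Perl.\n";
my $concat  = $who . "[1] is the father of Perl.\n";

# Array element followed by literal "[1]".
my $member_braced  = "Also ${name[0]}[1] was a member\n";
my $member_escaped = "Also $name[0]\[1] was a member\n";
```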

Consider the magic array @{^CAPTURE} which is populated by the regex engine with the contents of all of the capture buffers in a pattern (see perlvar and perlre ). The only way you can access one of these members inside of a string is via the braced (demarcated) form: after "abc" =~ /(.)(.)(.)/, the string "Second buffer is ${^CAPTURE[1]}" is equivalent to "Second buffer is " . ${^CAPTURE}[1].

Saying @^CAPTURE is a syntax error, so it must be referenced as @{^CAPTURE} , and to access one of its elements in normal code you would write ${^CAPTURE}[1] . However when interpolating in a string "${^CAPTURE}[1]" would be equivalent to ${^CAPTURE} . "[1]" , which does not even refer to the same variable! Thus the subscripts must also be placed inside of the braces: "${^CAPTURE[1]}" .
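A small sketch of both forms, assuming a Perl recent enough to provide @{^CAPTURE} (5.26 or later, which the use line enforces):

```perl
use 5.026;    # @{^CAPTURE} needs 5.26+; this also enables strict
use warnings;

"abc" =~ /(.)(.)(.)/ or die "pattern did not match";

my $interpolated = "Second buffer is ${^CAPTURE[1]}";        # subscript inside the braces
my $concatenated = "Second buffer is " . ${^CAPTURE}[1];     # normal code form

print "$interpolated\n";
```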

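The demarcated form using curly braces can be used with all the different types of variable access, including array and hash slices. For instance, with @name = qw(Larry Curly Moe) and $" set to " and ", printing "My favorites were @{name[1,2]}.\n" would output "My favorites were Curly and Moe."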
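A sketch of slice interpolation in the demarcated form, with $" localized so the change does not leak out of the block:

```perl
use strict;
use warnings;

my @name = qw(Larry Curly Moe);
my $line;
{
    local $" = " and ";                              # list separator for interpolation
    $line = "My favorites were @{name[1,2]}.\n";     # braced array slice
}
print $line;
```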

# Special floating point: infinity (Inf) and not-a-number (NaN)

Floating point values include the special values Inf and NaN , for infinity and not-a-number. The infinity can be also negative.

The infinity is the result of certain math operations that overflow the floating point range, like 9**9**9. The not-a-number is the result when the result is undefined or unrepresentable. Though note that you cannot get NaN from some common "undefined" or "out-of-range" operations like dividing by zero, or square root of a negative number, since Perl generates fatal errors for those.

The infinity and not-a-number have their own special arithmetic rules. The general rule is that they are "contagious": Inf plus one is Inf , and NaN plus one is NaN . Where things get interesting is when you combine infinities and not-a-numbers: Inf minus Inf and Inf divided by Inf are NaN (while Inf plus Inf is Inf and Inf times Inf is Inf ). NaN is also curious in that it does not equal any number, including itself: NaN != NaN .

Perl doesn't understand Inf and NaN as numeric literals, but you can have them as strings, and Perl will convert them as needed: "Inf" + 1. (You can, however, import them from the POSIX extension; use POSIX qw(Inf NaN); and then use them as literals.)

Note that on input (string to number) Perl accepts Inf and NaN in many forms. Case is ignored, and the Win32-specific forms like 1.#INF are understood, but on output the values are normalized to Inf and NaN .
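The arithmetic rules above can be sketched as follows, assuming a platform with IEEE 754 floating point (the usual case, but not guaranteed):

```perl
use strict;
use warnings;

my $inf = 9**9**9;        # overflows the floating point range
my $nan = $inf - $inf;    # Inf minus Inf is NaN

my $still_inf      = $inf + 1;                   # Inf is contagious
my $nan_is_unequal = ( $nan != $nan ) ? 1 : 0;   # NaN does not equal itself
my $from_string    = "Inf" + 1;                  # the string is converted as needed

print "$inf $nan\n";
```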

# Version Strings

A literal of the form v1.20.300.4000 is parsed as a string composed of characters with the specified ordinals. This form, known as v-strings, provides an alternative, more readable way to construct strings, rather than use the somewhat less readable interpolation form "\x{1}\x{14}\x{12c}\x{fa0}" . This is useful for representing Unicode strings, and for comparing version "numbers" using the string comparison operators, cmp , gt , lt etc. If there are two or more dots in the literal, the leading v may be omitted.
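A brief sketch of v-strings as strings of ordinals, and of why they compare better than plain dotted strings:

```perl
use strict;
use warnings;

my $foo  = v102.111.111;    # chr(102) . chr(111) . chr(111), i.e. "foo"
my $same = 102.111.111;     # two or more dots: the leading v is optional

# String comparison works on ordinals, so 10 sorts after 8.
my $vstring_newer = ( v5.10.1 gt v5.8.9 )   ? 1 : 0;
# Plain strings compare character by character, so "1" sorts before "8".
my $plain_newer   = ( "5.10.1" gt "5.8.9" ) ? 1 : 0;
```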

Such literals are accepted by both require and use for doing a version check. Note that using the v-strings for IPv4 addresses is not portable unless you also use the inet_aton()/inet_ntoa() routines of the Socket package.

Note that since Perl 5.8.1 the single-number v-strings (like v65 ) are not v-strings before the => operator (which is usually used to separate a hash key from a hash value); instead they are interpreted as literal strings ('v65'). They were v-strings from Perl 5.6.0 to Perl 5.8.0, but that caused more confusion and breakage than good. Multi-number v-strings like v65.66 and 65.66.67 continue to be v-strings always.

# Special Literals

The special literals __FILE__, __LINE__, and __PACKAGE__ represent the current filename, line number, and package name at that point in your program. __SUB__ gives a reference to the current subroutine. They may be used only as separate tokens; they will not be interpolated into strings. If there is no current package (due to an empty package; directive), __PACKAGE__ is the undefined value. (But the empty package; is no longer supported, as of version 5.10.) Outside of a subroutine, __SUB__ is the undefined value. __SUB__ is only available in 5.16 or higher, and only with a use v5.16 or use feature "current_sub" declaration.

The two control characters ^D and ^Z, and the tokens __END__ and __DATA__ may be used to indicate the logical end of the script before the actual end of file. Any following text is ignored by the interpreter unless read by the program as described below.

Text after __DATA__ may be read via the filehandle PACKNAME::DATA , where PACKNAME is the package that was current when the __DATA__ token was encountered. The filehandle is left open pointing to the line after __DATA__. The program should close DATA when it is done reading from it. (Leaving it open leaks filehandles if the module is reloaded for any reason, so it's a safer practice to close it.) For compatibility with older scripts written before __DATA__ was introduced, __END__ behaves like __DATA__ in the top level script (but not in files loaded with require or do ) and leaves the remaining contents of the file accessible via main::DATA .

The DATA file handle by default has whatever PerlIO layers were in place when Perl read the file to parse the source. Normally that means that the file is being read bytewise, as if it were encoded in Latin-1, but there are two major ways for it to be otherwise. Firstly, if the __END__ / __DATA__ token is in the scope of a use utf8 pragma then the DATA handle will be in UTF-8 mode. And secondly, if the source is being read from perl's standard input then the DATA file handle is actually aliased to the STDIN file handle, and may be in UTF-8 mode because of the PERL_UNICODE environment variable or perl's command-line switches.

See SelfLoader for more description of __DATA__, and an example of its use. Note that you cannot read from the DATA filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding __DATA__ (or __END__) token has not yet been seen.

# Barewords

A word that has no other interpretation in the grammar will be treated as if it were a quoted string. These are known as "barewords". As with filehandles and labels, a bareword that consists entirely of lowercase letters risks conflict with future reserved words, and if you use the use warnings pragma or the -w switch, Perl will warn you about any such words. Perl limits barewords (like identifiers) to about 250 characters. Future versions of Perl are likely to eliminate these arbitrary limitations.

Some people may wish to outlaw barewords entirely. If you say use strict 'subs'; then any bareword that would NOT be interpreted as a subroutine call produces a compile-time error instead. The restriction lasts to the end of the enclosing block. An inner block may countermand this by saying no strict 'subs' .

# Array Interpolation

Arrays and slices are interpolated into double-quoted strings by joining the elements with the delimiter specified in the $" variable ( $LIST_SEPARATOR if "use English;" is specified), space by default. The following are equivalent:
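A sketch of that equivalence using an ordinary array instead of @ARGV, so it runs without command-line arguments:

```perl
use strict;
use warnings;

my @args = qw(alpha beta gamma);

my $joined       = join( $", @args );    # explicit join with the list separator
my $interpolated = "@args";              # the same thing via interpolation

# Slices interpolate too, and $" can be changed locally.
my $with_commas = do { local $" = ", "; "@args[0,1]" };
```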

Within search patterns (which also undergo double-quotish substitution) there is an unfortunate ambiguity: Is /$foo[bar]/ to be interpreted as /${foo}[bar]/ (where [bar] is a character class for the regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to array @foo)? If @foo doesn't otherwise exist, then it's obviously a character class. If @foo exists, Perl takes a good guess about [bar] , and is almost always right. If it does guess wrong, or if you're just plain paranoid, you can force the correct interpretation with curly braces as above.

If you're looking for the information on how to use here-documents, which used to be here, that's been moved to "Quote and Quote-like Operators" in perlop .

# List value constructors

List values are denoted by separating individual values by commas (and enclosing the list in parentheses where precedence requires it), as in (LIST).

In a context not requiring a list value, the value of what appears to be a list literal is simply the value of the final element, as with the C comma operator. For example, @foo = ('cc', '-E', $bar); assigns the entire list value to array @foo, but $foo = ('cc', '-E', $bar); assigns the value of variable $bar to the scalar variable $foo. Note that the value of an actual array in scalar context is the length of the array; after @foo = ('cc', '-E', $bar); the statement $foo = @foo; assigns the value 3 to $foo.
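The difference between a list literal and an array in scalar context can be sketched like this (the void-context warning for the discarded constants is silenced locally):

```perl
use strict;
use warnings;

my $bar = 'file.c';
my @foo = ( 'cc', '-E', $bar );    # list assignment: @foo gets all three values

my $length = @foo;                 # array in scalar context: 3

my $last;
{
    no warnings 'void';            # 'cc' and '-E' are evaluated and thrown away
    $last = ( 'cc', '-E', $bar );  # comma operator: only the final value survives
}
```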

You may have an optional comma before the closing parenthesis of a list literal, so that you can say @foo = (1, 2, 3,); which is most useful when each element sits on its own line.

To use a here-document to assign an array, one line per element, you might use an approach like this:
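One way to do this is to match the here-document against a pattern in list context; the terminator name End_Lines is arbitrary:

```perl
use strict;
use warnings;

# Each match captures one line with its surrounding whitespace trimmed.
my @sauces = <<End_Lines =~ m/(\S.*\S)/g;
    normal tomato
    spicy tomato
    green chile
    pesto
    white wine
End_Lines

print scalar(@sauces), " sauces\n";    # 5 sauces
```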

LISTs do automatic interpolation of sublists. That is, when a LIST is evaluated, each element of the list is evaluated in list context, and the resulting list value is interpolated into LIST just as if each individual element were a member of LIST. Thus arrays and hashes lose their identity in a LIST: the list (@foo,@bar,&SomeSub,%glarch) contains all the elements of @foo followed by all the elements of @bar, followed by all the elements returned by the subroutine named SomeSub called in list context, followed by the key/value pairs of %glarch. To make a list reference that does NOT interpolate, see perlref .

The null list is represented by (). Interpolating it in a list has no effect. Thus ((),(),()) is equivalent to (). Similarly, interpolating an array with no elements is the same as if no array had been interpolated at that point.

This interpolation combines with the facts that the opening and closing parentheses are optional (except when necessary for precedence) and lists may end with an optional comma to mean that multiple commas within lists are legal syntax. The list 1,,3 is a concatenation of two lists, 1, and 3 , the first of which ends with that optional comma. 1,,3 is (1,),(3) is 1,3 (And similarly for 1,,,3 is (1,),(,),3 is 1,3 and so on.) Not that we'd advise you to use this obfuscation.

A list value may also be subscripted like a normal array. You must put the list in parentheses to avoid ambiguity. For example:
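A few list subscripts, sketched with values that do not depend on the filesystem:

```perl
use strict;
use warnings;

# localtime returns a list; element 5 is the year minus 1900.
my $year = ( localtime )[5] + 1900;

# Find a hex digit.
my $digit    = 12;
my $hexdigit = ( 'a', 'b', 'c', 'd', 'e', 'f' )[ $digit - 10 ];

# A "reverse comma operator": evaluate both, keep the first.
my @foo          = ( 1, 2, 3 );
my $popped_first = ( pop(@foo), pop(@foo) )[0];
```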

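Lists may be assigned to only when each element of the list is itself legal to assign to, as in ($x, $y, $z) = (1, 2, 3); or ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

An exception to this is that you may assign to undef in a list. This is useful for throwing away some of the return values of a function, as in ($dev, $ino, undef, undef, $uid, $gid) = stat($file);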

As of Perl 5.22, you can also use (undef)x2 instead of undef, undef . (You can also do ($x) x 2 , which is less useful, because it assigns to the same variable twice, clobbering the first value assigned.)
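List assignment, including the undef placeholder, might look like this in practice (a sketch with made-up values):

```perl
use strict;
use warnings;

my ( $x, $y, $z ) = ( 1, 2, 3 );

my %map;
( $map{red}, $map{blue}, $map{green} ) = ( 0x00f, 0x0f0, 0xf00 );

# undef on the left-hand side discards the value in that position.
my ( $first, undef, $third ) = qw(a b c);
```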

When you assign a list of scalars to an array, all previous values in that array are wiped out and the number of elements in the array will now be equal to the number of elements in the right-hand list -- the list from which assignment was made. The array will automatically resize itself to precisely accommodate each element in the right-hand list.

When, however, you assign a list of scalars to another list of scalars, the results differ according to whether the left-hand list -- the list being assigned to -- has the same, more or fewer elements than the right-hand list.

If the number of scalars in the left-hand list is less than that in the right-hand list, the "extra" scalars in the right-hand list will simply not be assigned.

If the number of scalars in the left-hand list is greater than that in the right-hand list, the "missing" scalars will become undefined.
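Both cases can be sketched in a couple of lines (variable names are illustrative):

```perl
use strict;
use warnings;

# Fewer targets than values: the extra 3 is simply not assigned.
my ( $one, $two ) = ( 1, 2, 3 );

# More targets than values: $r is left undefined.
my ( $p, $q, $r ) = ( 1, 2 );
```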

List assignment in scalar context returns the number of elements produced by the expression on the right side of the assignment:
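A sketch of that count, using two hypothetical subroutines (three_things and nothing) as the right-hand side:

```perl
use strict;
use warnings;

sub three_things { return ( 10, 20, 30 ) }    # hypothetical: returns three values
sub nothing      { return }                   # hypothetical: returns the null list

my ( $foo, $bar );
my $x = ( ( $foo, $bar ) = ( 3, 2, 1 ) );         # 3, not 2
my $n = ( ( $foo, $bar ) = three_things() );      # the return count of three_things()
my $none = () = nothing();                        # 0, which is false
```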

This is handy when you want to do a list assignment in a Boolean context, because most list functions return a null list when finished, which when assigned produces a 0, which is interpreted as FALSE.

It's also the source of a useful idiom for executing a function or performing an operation in list context and then counting the number of return values, by assigning to an empty list and then using that assignment in scalar context. For example, this code: $count = () = $string =~ /\d+/g; will place into $count the number of digit groups found in $string. This happens because the pattern match is in list context (since it is being assigned to the empty list), and will therefore return a list of all matching parts of the string. The list assignment in scalar context will translate that into the number of elements (here, the number of times the pattern matched) and assign that to $count. Note that simply using $count = $string =~ /\d+/g; would not have worked, since a pattern match in scalar context will only return true or false, rather than a count of matches.
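The counting idiom and the scalar-context match can be compared on a sample string (the string is invented for the example):

```perl
use strict;
use warnings;

my $string = "call 555 0199 ext 42";

my $count   = () = $string =~ /\d+/g;    # list context match, then counted: 3
my $matched = $string =~ /\d+/g;         # scalar context: only true or false
```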

The final element of a list assignment may be an array or a hash, as in ($x, $y, @rest) = split; or my($x, $y, %rest) = @_;

You can actually put an array or hash anywhere in the list, but the first one in the list will soak up all the values, and anything after it will become undefined. This may be useful in a my() or local().
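A sketch of an array soaking up the remaining values, including the case where it is not the last element:

```perl
use strict;
use warnings;

my ( $x, $y, @rest ) = split ' ', "a b c d";    # @rest gets ("c", "d")

# An array in the middle takes everything; what follows it stays undefined.
my ( $head, @all, $never ) = ( 1, 2, 3 );
```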

A hash can be initialized using a literal list holding pairs of items to be interpreted as a key and a value: %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

While literal lists and named arrays are often interchangeable, that's not the case for hashes. Just because you can subscript a list value like a normal array does not mean that you can subscript a list value as a hash. Likewise, hashes included as parts of other lists (including parameters lists and return lists from functions) always flatten out into key/value pairs. That's why it's good to use references sometimes.

It is often more readable to use the => operator between key/value pairs. The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple identifier. => doesn't quote compound identifiers that contain double colons. This makes it nice for initializing hashes, for initializing hash references to be used as records, or for using call-by-named-parameter to complicated functions.
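All three uses of => can be sketched together. The subroutine describe_field and its parameter names are hypothetical, made up to show named parameters:

```perl
use strict;
use warnings;

# Initializing a hash.
my %map = ( red => 0x00f, blue => 0x0f0, green => 0xf00 );

# Initializing a hash reference used as a record.
my $rec = {
    witch => 'Mable the Merciless',
    cat   => 'Fluffy the Ferocious',
    date  => '10/31/1776',
};

# Call-by-named-parameter: later pairs override the defaults listed first.
sub describe_field {
    my %opt = ( width => 10, @_ );
    return "$opt{name}:$opt{width}";
}
my $field = describe_field( name => 'times', width => 30 );
```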

Note that just because a hash is initialized in that order doesn't mean that it comes out in that order. See "sort" in perlfunc for examples of how to arrange for an output ordering.

If a key appears more than once in the initializer list of a hash, the last occurrence wins. This can be used to provide overridable configuration defaults:
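A sketch of both points; the hash names and values are illustrative:

```perl
use strict;
use warnings;

# Duplicate keys: the last occurrence of center and radius wins.
my %circle = (
    center => [ 5, 10 ],
    center => [ 27, 9 ],
    radius => 100,
    radius => 54,
);

# Values in %args take priority over %config_defaults.
my %config_defaults = ( host => 'localhost', port => 8080 );
my %args            = ( port => 9090 );
my %config          = ( %config_defaults, %args );
```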

# Subscripts

An array can be accessed one scalar at a time by specifying a dollar sign ( $ ), then the name of the array (without the leading @ ), then the subscript inside square brackets. For example:
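A minimal example, with an array named @myarray:

```perl
use strict;
use warnings;

my @myarray = ( 5, 50, 500, 5000 );
print "The Third Element is ", $myarray[2], "\n";    # The Third Element is 500
```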

The array indices start with 0. A negative subscript retrieves its value from the end. In our example, $myarray[-1] would have been 5000, and $myarray[-2] would have been 500.

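Hash subscripts are similar, only instead of square brackets curly brackets are used. For example, $scientists{"Darwin"} looks up the value stored under the key "Darwin" in the hash %scientists.

You can also subscript a list to get a single element from it, as in $dir = (getpwnam("daemon"))[7];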

# Multi-dimensional array emulation

Multidimensional arrays may be emulated by subscripting a hash with a list. The elements of the list are joined with the subscript separator (see "$;" in perlvar ), so $foo{$x,$y,$z} is equivalent to $foo{join($;, $x, $y, $z)}.

The default subscript separator is "\034", the same as SUBSEP in awk .
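A sketch of the emulation, assuming $; still has its default value of "\034":

```perl
use strict;
use warnings;

my %foo;
my ( $x, $y, $z ) = ( 1, 2, 3 );

$foo{ $x, $y, $z } = 'cell';                     # the key is "1\0342\0343"
my $same = $foo{ join( $;, $x, $y, $z ) };       # the same element, spelled out

# Recover the coordinates by splitting on the default separator.
my ($only_key) = keys %foo;
my @coords = split /\034/, $only_key;
```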

# Slices

A slice accesses several elements of a list, an array, or a hash simultaneously using a list of subscripts. It's more convenient than writing out the individual elements as a list of separate scalar values.
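The three kinds of slice can be sketched with invented data (an ordinary hash stands in for %ENV so the result is predictable):

```perl
use strict;
use warnings;

my @folks = qw(Ann Bob Cy Dee Eve);
my %env   = ( USER => 'ann', HOME => '/home/ann', SHELL => '/bin/sh' );

my ( $him, $her )   = @folks[ 0, -1 ];               # array slice
my @them            = @folks[ 0 .. 3 ];              # array slice
my ( $who, $home )  = @env{ "USER", "HOME" };        # hash slice
my ( $second, $fourth ) = ( 10, 20, 30, 40 )[ 1, 3 ];    # list slice
```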

Since you can assign to a list of variables, you can also assign to an array or hash slice.

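A slice assignment such as @days[3..5] = qw/Wed Thu Fri/; is exactly equivalent to ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/; and the same holds for hash slices.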
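Assignments to array and hash slices, sketched with illustrative data:

```perl
use strict;
use warnings;

my @days = qw(Sun Mon Tue);
@days[ 3 .. 5 ] = qw/Wed Thu Fri/;                 # extends the array

my %colors;
@colors{ 'red', 'blue', 'green' } = ( 0xff0000, 0x0000ff, 0x00ff00 );

my @folks = qw(Ann Bob Cy);
@folks[ 0, -1 ] = @folks[ -1, 0 ];                 # swap the first and last elements
```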

Since changing a slice changes the original array or hash that it's slicing, a foreach construct will alter some--or even all--of the values of the array or hash.
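A sketch showing that the loop variable aliases the sliced elements, so edits land in the original array and hash:

```perl
use strict;
use warnings;

my @array = ('peter') x 4;
foreach ( @array[ 1 .. 2 ] ) { s/peter/paul/ }     # only elements 1 and 2 change

my %hash = ( key1 => '  hello world ', key2 => 'trim me  ', key3 => '  as is  ' );
foreach ( @hash{ qw[key1 key2] } ) {
    s/^\s+//;    # trim leading whitespace
    s/\s+$//;    # trim trailing whitespace
}
```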

As a special exception, when you slice a list (but not an array or a hash), if the list evaluates to empty, then taking a slice of that empty list will always yield the empty list in turn. Thus:
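The rule can be sketched with a handful of list slices; only the element counts matter here:

```perl
use strict;
use warnings;

my @a = ()[ 0, 1 ];                   # @a has no elements
my @b = (@a)[ 0, 1 ];                 # @b has no elements
my @c = ( sub { }->() )[ 0, 1 ];      # @c has no elements
my @d = ( 'a', 'b' )[ 0, 1 ];         # @d has two elements
my @e = ( 'a', 'b' )[ 1, 2 ];         # @e has two elements, the second undefined
```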

This makes it easy to write loops that terminate when a null list is returned, such as while ( ($home, $user) = (getpwent)[7,0] ) { ... }, which walks the password file.

As noted earlier in this document, the scalar sense of list assignment is the number of elements on the right-hand side of the assignment. The null list contains no elements, so when the password file is exhausted, the result is 0, not 2.

Slices in scalar context return the last item of the slice.
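For example, with an array and a hash invented for the purpose:

```perl
use strict;
use warnings;

my @a = qw/first second third/;
my %h = ( first => 'A', second => 'B' );

my $t = @a[ 0, 1 ];                  # $t is now 'second'
my $u = @h{ 'first', 'second' };     # $u is now 'B'
```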

If you're confused about why you use an '@' there on a hash slice instead of a '%', think of it like this. The type of bracket (square or curly) governs whether it's an array or a hash being looked at. On the other hand, the leading symbol ('$' or '@') on the array or hash indicates whether you are getting back a singular value (a scalar) or a plural one (a list).

# Key/Value Hash Slices

Starting in Perl 5.20, a hash slice operation with the % symbol is a variant of slice operation returning a list of key/value pairs rather than just values:
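A short sketch, guarded by a version requirement since the syntax needs Perl 5.20 or later:

```perl
use v5.20;    # key/value slices need 5.20+; this also enables strict
use warnings;

my %h = ( blonk => 2, foo => 3, squink => 5, bar => 8 );

my %subset = %h{ 'foo', 'bar' };    # key/value hash slice
# %subset is now (foo => 3, bar => 8)
```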

However, the result of such a slice cannot be localized or assigned to. These are otherwise very much consistent with hash slices using the @ symbol.

# Index/Value Array Slices

Similar to key/value hash slices (and also introduced in Perl 5.20), the % array slice syntax returns a list of index/value pairs:
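The array form, sketched under the same version requirement:

```perl
use v5.20;    # index/value slices need 5.20+; this also enables strict
use warnings;

my @a    = "a" .. "z";
my @list = %a[ 3, 4, 6 ];
# @list is now (3, "d", 4, "e", 6, "g")
```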

Note that calling delete on array values is strongly discouraged.

# Typeglobs and Filehandles

Perl uses an internal type called a typeglob to hold an entire symbol table entry. The type prefix of a typeglob is a * , because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references, this is seldom needed.

The main use of typeglobs in modern Perl is to create symbol table aliases. This assignment: *this = *that; makes $this an alias for $that, @this an alias for @that, %this an alias for %that, &this an alias for &that, etc. Much safer is to use a reference.

This: local *Here::blue = \$There::green; temporarily makes $Here::blue an alias for $There::green, but doesn't make @Here::blue an alias for @There::green, or %Here::blue an alias for %There::green, etc. See "Symbol Tables" in perlmod for more examples of this. Strange though this may seem, this is the basis for the whole module import/export system.
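A sketch of the reference form, using package variables in main rather than the Here and There packages named above:

```perl
use strict;
use warnings;

our $green = "emerald";
our @green = ( 1, 2, 3 );
our ( $blue, @blue );

my $blue_count_inside;
{
    local *blue = \$green;           # alias only the scalar slot, for this block
    $blue = "jade";                  # writes through to $green
    $blue_count_inside = @blue;      # @blue is not an alias for @green
}
# Outside the block the alias is gone, but the write to $green remains.
```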

Another use for typeglobs is to pass filehandles into a function or to create new filehandles. If you need to use a typeglob to save away a filehandle, do it this way: $fh = *STDOUT; or perhaps as a real reference, like this: $fh = \*STDOUT;

See perlsub for examples of using these as indirect filehandles in functions.

Typeglobs are also a way to create a local filehandle using the local() operator. These last until their block is exited, but may be passed back. For example:
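A sketch of the idiom, adapted to open an in-memory file so that it runs anywhere (the subroutine name newopen and the sample text are illustrative):

```perl
use strict;
use warnings;

sub newopen {
    my $text = shift;
    local *FH;                                  # not my!
    open( FH, '<', \$text ) or return undef;    # in-memory file instead of a path
    return *FH;                                 # the handle survives the block exit
}

my $fh    = newopen("first line\nsecond line\n");
my $first = <$fh>;
close $fh;
```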

Now that we have the *foo{THING} notation, typeglobs aren't used as much for filehandle manipulations, although they're still needed to pass brand new file and directory handles into or out of functions. That's because *HANDLE{IO} only works if HANDLE has already been used as a handle. In other words, *FH must be used to create new symbol table entries; *foo{THING} cannot. When in doubt, use *FH .

All functions that are capable of creating filehandles (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) automatically create an anonymous filehandle if the handle passed to them is an uninitialized scalar variable. This allows the constructs such as open(my $fh, ...) and open(local $fh,...) to be used to create filehandles that will conveniently be closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example:
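A sketch of a lexical filehandle passed back from a subroutine. The name myopen is illustrative, and an in-memory file is assumed so the example needs no file on disk:

```perl
use strict;
use warnings;

sub myopen {
    my ($text) = @_;
    open( my $fh, '<', \$text )
        or die "Can't open in-memory file: $!";
    return $fh;
}

my @lines;
{
    my $f = myopen("first\nsecond\n");
    @lines = <$f>;
    # $f is implicitly closed here, when the last reference goes away
}
```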

Note that if an initialized scalar variable is used instead the result is different: my $fh='zzz'; open($fh, ...) is equivalent to open( *{'zzz'}, ...) . use strict 'refs' forbids such practice.

Another way to create anonymous filehandles is with the Symbol module or with the IO::Handle module and its ilk. These modules have the advantage of not hiding different types of the same name during the local(). See the bottom of "open" in perlfunc for an example.

See perlvar for a description of Perl's built-in variables and a discussion of legal variable names. See perlref , perlsub , and "Symbol Tables" in perlmod for more discussion on typeglobs and the *foo{THING} syntax.


MarketSplash

How To Perform Perl Hash Manipulation For Data Handling

Hash manipulation in Perl provides developers with a versatile toolset for managing data. This article breaks down the essentials, from creation to best practices, ensuring a comprehensive grasp of Perl hashes for efficient coding.

Perl's hashes, often known as associative arrays, offer a straightforward way to store and organize data by key-value pairs. With their intuitive syntax and efficient storage mechanisms, they've become a staple in numerous scripts and applications. Familiarity with hash manipulation can significantly elevate your coding prowess in Perl.


Understanding Perl Hash Basics


In the realm of Perl programming, a hash is a data structure that associates keys with values. Unlike arrays, which use numerical indexes, hashes use strings as keys (any other value used as a key is converted to a string). This characteristic provides a more descriptive way to store and retrieve information.

Declaring A Hash

Checking for key existence.

To declare a hash in Perl, you use the percent symbol (%) . Within each pair, the key is separated from its value by a fat comma (=>) , and the pairs themselves are separated by ordinary commas.

Accessing an element in a hash requires using its key. This is accomplished with curly braces {} .

If you wish to change a value associated with a key in a hash, simply assign a new value to it.

Perl provides a way to check if a certain key exists in a hash using the exists function.
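The four basics described above (declaring, reading, updating, and testing for a key) can be sketched together. The hash name %person and its keys are illustrative assumptions, not taken from the original article.

```perl
use strict;
use warnings;

# Declare a hash; => separates each key from its value.
my %person = (
    name => 'Alice',
    age  => 30,
);

# Read one value: scalar sigil plus curly braces around the key.
my $name = $person{name};

# Update an existing value by assigning to its key.
$person{age} = 31;

# exists tests for the key itself, even when its value is undef.
my $has_email = exists $person{email} ? 1 : 0;

print "$name is $person{age}; has email: $has_email\n";
```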

By understanding these foundational aspects of Perl hashes, you can efficiently handle key-value pair operations in your scripts and applications.

Hashes, with their key-value pairs , are core to Perl's data structures. Let's delve into how to create and initialize them effectively.

Basic Hash Creation

Initialization using lists, using references.

To create a hash in Perl, you make use of the percent symbol (%) . The key and value within a pair are connected by the fat comma (=>) .

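Hashes can also be initialized using two separate lists, one for keys and the other for values. A hash slice assignment pairs them up directly. In recent versions of the List::Util module (1.56 and later), the mesh function is another option; its zip function returns array references, which need flattening before they can fill a hash.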

Perl also provides the ability to use hash references for more complex data structures. These references point to the memory location of the hash.
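A minimal sketch of the three approaches, with made-up fruit data. The two-list case here uses a plain hash slice rather than a List::Util function, which is a substitution on my part to keep the example free of version requirements.

```perl
use strict;
use warnings;

# 1. A literal list of key/value pairs.
my %color_of = ( apple => 'red', banana => 'yellow' );

# 2. Two parallel lists, paired up with a hash slice assignment.
my @fruits = qw(cherry lime);
my @colors = qw(red green);
my %from_lists;
@from_lists{@fruits} = @colors;

# 3. An anonymous hash; $ref holds a reference to it.
my $ref = { grape => 'purple' };

print join( ', ', map { "$_=$from_lists{$_}" } sort keys %from_lists ), "\n";
print "grape is $ref->{grape}\n";
```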

By mastering these methods, you pave the way for flexible and efficient data organization in your Perl scripts. Remember, the choice of method largely depends on the specific requirements of your application.

Efficiently navigating a Perl hash allows you to make the most of this data structure. Hash elements, paired as keys and values , can be accessed in several ways, each catering to different needs.

Direct Access

Accessing via references, iterating through elements, slice access.

The most straightforward method to fetch a value from a hash is by specifying its key. This is done using curly braces {} .

When working with hash references , the process of access differs slightly. The arrow operator (->) comes into play.

Often, you might want to loop through all elements of a hash. The each function enables this.

For times when you need to retrieve multiple values at once, Perl offers hash slices .
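The four access styles above might look like this in practice. The %stock hash and its contents are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my %stock     = ( apples => 5, pears => 2, plums => 9 );
my $stock_ref = \%stock;

# Direct access by key.
my $apples = $stock{apples};

# Access through a reference with the arrow operator.
my $pears = $stock_ref->{pears};

# each returns one key/value pair per call, in no particular order.
my $total = 0;
while ( my ( $item, $count ) = each %stock ) {
    $total += $count;
}

# A hash slice (note the @ sigil) fetches several values at once.
my ( $apple_count, $plum_count ) = @stock{qw(apples plums)};

print "apples=$apples pears=$pears total=$total slice=$apple_count,$plum_count\n";
```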

Armed with these methods, you can seamlessly navigate and manipulate any Perl hash, ensuring the right data is always within reach.

In dynamic applications, data isn't static. As such, the ability to modify hash values is essential. Whether you're updating, adding, or making bulk changes, Perl's hash structure supports seamless transformations.

Simple Value Update

Adding new elements, modifying through references, bulk modifications using hash slices.

Modifying an existing value in a hash is as straightforward as a variable assignment. Just reference the key whose value you intend to change.

If a key doesn't already exist in the hash, assigning a value to it will add that key-value pair.

For hash references , the process involves the arrow operator (->) .

To modify multiple hash elements simultaneously, hash slices are your best bet.
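As a rough sketch, the four kinds of modification can be applied one after another to the same hash. The %price data is hypothetical.

```perl
use strict;
use warnings;

my %price = ( tea => 2, coffee => 3 );

# Update an existing value.
$price{tea} = 2.5;

# Assigning to a key that does not exist yet adds the pair.
$price{cocoa} = 4;

# Modify through a reference with the arrow operator.
my $price_ref = \%price;
$price_ref->{coffee} = 3.5;

# Change several values at once with a hash slice.
@price{qw(tea cocoa)} = ( 3, 5 );

print "$_: $price{$_}\n" for sort keys %price;
```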

Being proficient in altering hash values ensures your data remains up-to-date, accommodating the ever-evolving requirements of your Perl programs.

When managing data structures in Perl, the need to remove specific keys or entirely reset a hash is a common task. Perl provides straightforward mechanisms for these actions, ensuring data integrity and optimization.

Single Key Deletion

Batch deletion, purging the entire hash, handling hash references.

To get rid of a specific key-value pair, employ the delete function . By targeting the key, you directly remove its associated value.

When multiple elements are redundant, hash slices become particularly useful.

Sometimes, a complete hash clearance is necessary. This can be accomplished with a simple operation.

When working with hash references , slight syntactical adjustments ensure successful deletions.
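The deletion variants described above, sketched on a small hypothetical %cache hash:

```perl
use strict;
use warnings;

my %cache = ( a => 1, b => 2, c => 3, d => 4, e => 5 );

# delete removes one pair and returns the value it held.
my $removed = delete $cache{a};

# A hash slice deletes several keys in one statement.
delete @cache{qw(b c)};

# Through a reference, use the arrow operator.
my $cache_ref = \%cache;
delete $cache_ref->{d};

# keys in scalar context gives the number of pairs left.
my $remaining = keys %cache;

# Assigning an empty list clears the whole hash.
%cache = ();
my $after_clear = keys %cache;

print "removed=$removed remaining=$remaining after_clear=$after_clear\n";
```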

Maintaining a hash free from unnecessary entries promotes efficient data usage and optimal program performance.

Working with Perl hashes often means not just adding or accessing values, but also removing them. Perl offers intuitive mechanisms to handle such deletions, ensuring clean and efficient data structures.

Removing A Single Element

Wiping the entire hash, addressing hash references.

The delete function is your go-to method for removing a key-value pair. Specify the key, and both it and its corresponding value disappear.

Should you need to eliminate multiple elements at once, hash slices will be invaluable.

On occasions, you might need to start afresh. A straightforward assignment helps clear the hash.

Dealing with hash references requires a minor tweak in the syntax, but the essence remains the same.

Regular maintenance and removal of undesired entries from a hash guarantees streamlined data structures, fostering quicker operations and better application responsiveness.

One of the most common operations on a Perl hash is iteration. Whether you're exploring values, keys, or both, Perl provides elegant ways to loop through each element efficiently.

Iterating Over Keys

Iterating over values, iterating using each, foreach loops and hash slices.

The primary method to loop through all the keys of a hash is by employing the keys function .

When only the values matter, use the values function .

For a more compact iteration of both keys and values, the each function stands out.

Sometimes, direct control over iteration is essential. Here, foreach loops and hash slices offer granular power.
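One way to sketch these iteration styles side by side, using an invented %score hash. Because hash order is not defined, the examples sort wherever a stable order matters.

```perl
use strict;
use warnings;

my %score = ( alice => 90, bob => 72, carol => 85 );

# keys: sorted here so the order is predictable.
my @names = sort keys %score;

# values: when the keys do not matter.
my $sum = 0;
$sum += $_ for values %score;

# each: key and value together, one pair per call.
my %copy;
while ( my ( $name, $points ) = each %score ) {
    $copy{$name} = $points;
}

# foreach with an explicit ordering: highest score first.
my @ranked = sort { $score{$b} <=> $score{$a} } keys %score;
foreach my $name (@ranked) {
    print "$name: $score{$name}\n";
}
```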

Navigating through Perl hashes via iteration is a foundational skill, helping in data extraction, transformation, and inspection. With a grasp on these methods, you're primed to manage and utilize hashes effectively.

Hashes, the associative arrays in Perl, come equipped with a rich set of built-in functions and utilities. These tools not only provide ways to manage the data but also enhance hash-related operations.

Checking Existence With Exists

Determining size with scalars, deleting entries with delete, inverting hashes, clearing hashes with undef.

Want to see if a key exists in a hash? The exists function is your friend.

To know how many key-value pairs a hash has, evaluate keys in scalar context, as in scalar(keys %hash) .

If you need to remove a key-value pair, utilize the delete function .

Occasionally, swapping keys and values is essential. For this, a hash inversion is practical.

To wipe a hash clean entirely, the undef function does the job; assigning an empty list to the hash has the same visible effect.
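The utilities listed in this section can be sketched in a few lines. The %code_for hash is an assumption, and the inversion step only works cleanly because its values happen to be unique.

```perl
use strict;
use warnings;

my %code_for = ( red => 'R', green => 'G', blue => 'B' );

# exists: is the key present?
my $has_red = exists $code_for{red} ? 1 : 0;

# keys in scalar context: how many pairs?
my $size = scalar keys %code_for;

# delete: remove one pair.
delete $code_for{blue};

# reverse flattens the hash to a list and swaps keys with values.
# Duplicate values would collapse into a single key.
my %name_for = reverse %code_for;

# undef empties the hash completely.
undef %code_for;
my $size_after = scalar keys %code_for;

print "size=$size R=$name_for{R} size_after=$size_after\n";
```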

Harnessing these functions and utilities aids in efficient hash management, ensuring data is accessible, adaptable, and actionable. With them in your repertoire, you're set to make the most of Perl's hash capabilities.

Can Perl hashes store multiple values for a single key?

No, each key in a Perl hash is unique and maps to a single value. However, you can store a reference to an array or another hash as the value if you need multiple values for a single key.
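A short sketch of that workaround, storing an array reference as the value. The %phones data is made up for illustration.

```perl
use strict;
use warnings;

# Each value is a reference to an array of phone numbers.
my %phones = ( alice => [ '555-0100', '555-0101' ] );

# Add another value under an existing key.
push @{ $phones{alice} }, '555-0102';

# For a new key, Perl creates the array on first use (autovivification).
push @{ $phones{bob} }, '555-0200';

my $alice_count = @{ $phones{alice} };
print "alice has $alice_count numbers; bob's first is $phones{bob}[0]\n";
```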

How can I check if a specific key exists in a hash?

You can use the exists function. For example, to check if the key 'name' exists in %person hash, you'd use exists $person{'name'} .

How do I delete a key-value pair from a hash in Perl?

You can use the delete function. For instance, to remove the key 'age' and its associated value from %person hash, you'd write delete $person{'age'} .

What is the difference between an array and a hash in Perl?

The primary difference is how elements are accessed. In an array, elements are accessed using their index (a number), while in a hash, elements (values) are accessed using keys (typically strings). Arrays are ordered collections, while hashes are unordered.

Are Perl hash keys case-sensitive?

Yes, hash keys in Perl are case-sensitive. This means that the keys 'Name' and 'name' would refer to two different entries in the hash.



© 2010-2012 chromatic

Published by Onyx Neon

The Perl Language

Like a spoken language, the whole of Perl is a combination of several smaller but interrelated parts. Unlike spoken language, where nuance and tone of voice and intuition allow people to communicate despite slight misunderstandings and fuzzy concepts, computers and source code require precision. You can write effective Perl code without knowing every detail of every language feature, but you must understand how they work together to write Perl code well.

Names (or identifiers ) are everywhere in Perl programs: variables, functions, packages, classes, and even filehandles. These names all begin with a letter or an underscore and may optionally include any combination of letters, numbers, and underscores. When the utf8 pragma ( Unicode and Strings ) is in effect, you may use any UTF-8 word characters in identifiers. These are all valid Perl identifiers:

These are invalid Perl identifiers:
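As a rough sketch of both lists, the declarations below compile, while the commented-out lines show names the parser would reject if uncommented. All names are invented for illustration.

```perl
use strict;
use warnings;

# Valid: start with a letter or underscore, then letters, digits, underscores.
my $name;
my @_private_identifier;
my %Names_to_Addresses;
sub anAwkwardName3 { return 1 }

# Invalid: each of these would stop compilation.
# my $invalid name;    # contains a space
# my @3;               # starts with a digit
# my %~flags;          # starts with punctuation

print "compiled fine\n" if anAwkwardName3();
```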

Names exist primarily for the benefit of the programmer . These rules apply only to literal names which appear as-is in your source code, such as sub fetch_pie or my $waffleiron . Only Perl's parser enforces the rules about identifier names.

Perl's dynamic nature allows you to refer to entities with names generated at runtime or provided as input to a program. These symbolic lookups provide flexibility at the expense of some safety. In particular, invoking functions or methods indirectly or looking up symbols in a namespace lets you bypass Perl's parser.

Doing so can produce confusing code. As Mark Jason Dominus recommends so effectively ( http://perl.plover.com/varvarname.html ), use a hash ( Hashes ) or nested data structure ( Nested Data Structures ) instead.
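One way to follow that advice is a dispatch table: a hash that maps runtime names to code references, so no symbolic lookup is needed. The operation names and the dispatch helper below are hypothetical.

```perl
use strict;
use warnings;

# Map names that arrive at runtime to code, instead of looking up subs by name.
my %handler_for = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);

sub dispatch {
    my ( $op, @args ) = @_;
    my $handler = $handler_for{$op}
        or die "Unknown operation '$op'\n";
    return $handler->(@args);
}

print dispatch( 'add', 2, 3 ), "\n";
```

An unknown name fails with a clear error rather than silently calling whatever function happens to match.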

Variable Names and Sigils

Variable names always have a leading sigil (or symbol) which indicates the type of the variable's value. Scalar variables ( Scalars ) use the dollar sign ( $ ). Array variables ( Arrays ) use the at sign ( @ ). Hash variables ( Hashes ) use the percent sign ( % ):
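A minimal illustration of the three sigils, with placeholder names and values:

```perl
use strict;
use warnings;

my $scalar = 'one value';              # $ for a scalar
my @array  = ( 1, 2, 3 );              # @ for an array
my %hash   = ( key => 'value' );       # % for a hash

print "$scalar / @array / $hash{key}\n";
```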

These sigils provide a visual namespacing for variable names. It's possible—though confusing—to declare multiple variables of the same name with different types:

Though Perl won't get confused, people reading this code will.

Perl 5's sigils are variant sigils . As context determines how many items you expect from an operation or what type of data you expect to get, so the sigil governs how you manipulate the data of a variable. For example, to access a single element of an array or a hash, you must use the scalar sigil ( $ ):

The parallel with amount context is important. Using a scalar element of an aggregate as an lvalue (the target of an assignment, on the left side of the = character) imposes scalar context ( Context ) on the rvalue (the value assigned, on the right side of the = character).

Similarly, accessing multiple elements of a hash or an array—an operation known as slicing —uses the at symbol ( @ ) and imposes list context ... even if the list itself has zero or one elements :
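Both rules can be seen in a small sketch: single elements take the $ sigil, while slices take @ and yield a list of whatever length the indices call for. The data is invented.

```perl
use strict;
use warnings;

my @array = qw(zero one two three);
my %hash  = ( x => 1, y => 2, z => 3 );

# Single elements: scalar sigil.
my $element = $array[1];
my $value   = $hash{y};

# Slices: @ sigil, list context.
my @some = @array[ 0, 2 ];
my @vals = @hash{qw(x z)};

# A slice is still a list when it selects one element or none.
my @wanted  = ('y');
my @single  = @hash{@wanted};
my @indices = ();
my @none    = @array[@indices];

print "element=$element value=$value some=@some vals=@vals\n";
```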

The most reliable way to determine the type of a variable—scalar, array, or hash—is to look at the operations performed on it. Scalars support all basic operations, such as string, numeric, and boolean manipulations. Arrays support indexed access through square brackets. Hashes support keyed access through curly brackets.

Perl provides a mechanism to group similar functions and variables into their own unique named spaces— namespaces ( Packages ). A namespace is a named collection of symbols. Perl allows multi-level namespaces, with names joined by double colons ( :: ), where DessertShop::IceCream refers to a logical collection of related variables and functions, such as scoop() and pour_hot_fudge() .

Within a namespace, you may use the short name of its members. Outside of the namespace, refer to a member using its fully-qualified name . That is, within DessertShop::IceCream , add_sprinkles() refers to the same function as does DessertShop::IceCream::add_sprinkles() outside of the namespace.
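A short sketch of short versus fully-qualified names. The function bodies are assumptions; only the package and function names come from the text above, and make_sundae is a hypothetical helper.

```perl
use strict;
use warnings;

package DessertShop::IceCream;

sub add_sprinkles {
    my ($treat) = @_;
    return "$treat with sprinkles";
}

# Inside the namespace, the short name is enough.
sub make_sundae { return add_sprinkles('sundae') }

package main;

# Outside the namespace, use the fully-qualified name.
my $cone   = DessertShop::IceCream::add_sprinkles('cone');
my $sundae = DessertShop::IceCream::make_sundae();

print "$cone\n$sundae\n";
```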

While standard naming rules apply to package names, by convention user-defined packages all start with uppercase letters. The Perl core reserves lowercase package names for core pragmas ( Pragmas ), such as strict and warnings . This is a policy enforced primarily by community guidelines.

All namespaces in Perl 5 are globally visible. When Perl looks up a symbol in DessertShop::IceCream::Freezer , it looks in the main:: symbol table for a symbol representing the DessertShop:: namespace, then in there for the IceCream:: namespace, and so on. The Freezer:: is visible from outside of the IceCream:: namespace. The nesting of the former within the latter is only a storage mechanism, and implies nothing further about relationships between parent and child or sibling packages. Only a programmer can make logical relationships between entities obvious—by choosing good names and organizing them well.

A variable in Perl is a storage location for a value ( Values ). While a trivial program can manipulate values directly, most programs work with variables to simplify the logic of the code. A variable represents values; it's easier to explain the Pythagorean theorem in terms of the variables a , b , and c than by intuiting its principle by producing a long list of valid values. This concept may seem basic, but effective programming requires you to manage the art of balancing the generic and reusable with the specific.

Variable Scopes

Variables are visible to portions of your program depending on their scope ( Scope ). Most of the variables you will encounter have lexical scope ( Lexical Scope ). Files themselves provide their own lexical scopes, such that the package declaration on its own does not create a new scope:

As of Perl 5.14, you may provide a block to the package declaration. This syntax does provide a lexical scope:
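The difference between the two forms can be sketched as follows. The package and variable names are illustrative, and the example assumes Perl 5.14 or later for the block syntax.

```perl
use 5.014;
use warnings;

package Store::Toy;
my $discount = 0.10;

package Store::Music;
# Same file-level lexical scope, so $discount is still visible here.
my $seen_in_music = $discount;

package Store::BoardGame {
    # Lexical to this block only; invisible after the closing brace.
    my $block_discount = 0.25;
    sub rate { return $block_discount }
}

package main;
print "music sees $seen_in_music, board games use ",
    Store::BoardGame::rate(), "\n";
```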

Variable Sigils

The sigil of the variable in a declaration determines the type of the variable: scalar, array, or hash. The sigil used when accessing a variable varies depending on what you do to the variable. For example, you declare an array as @values . Access the first element—a single value—of the array with $values[0] . Access a list of values from the array with @values[ @indices ] .

Anonymous Variables

Perl variables do not require names. Names exist to help you, the programmer, keep track of an $apple , @barrels , or %cheap_meals . Variables created without literal names in your source code are anonymous variables. The only way to access anonymous variables is by reference ( References ).

Variables, Types, and Coercion

A variable in Perl 5 represents both a value (a dollar cost, available pizza toppings, guitar shops with phone numbers) and the container which stores that value. Perl's type system deals with value types and container types . While a variable's container type —scalar, array, or hash—cannot change, Perl is flexible about a variable's value type. You may store a string in a variable in one line, append to that variable a number on the next, and reassign a reference to a function ( Function References ) on the third.

Performing an operation on a variable which imposes a specific value type may cause coercion ( Coercion ) from the variable's existing value type.

For example, the documented way to determine the number of entries in an array is to evaluate that array in scalar context ( Context ). Because a scalar variable can only ever contain a scalar, assigning an array to a scalar imposes scalar context on the operation, and an array evaluated in scalar context returns the number of elements in the array:
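In code, with a made-up @items array, that looks like:

```perl
use strict;
use warnings;

my @items = qw(pizza salad dessert);

# Assigning to a scalar imposes scalar context, which yields the element count.
my $count = @items;

print "There are $count items\n";
```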

This relationship between variable types, sigils, and context is essential.

The structure of a program depends heavily on the means by which you model your data with appropriate variables.

Where variables allow the abstract manipulation of data, the values they hold make programs concrete and useful. The more accurate your values, the better your programs. These values are data—your aunt's name and address, the distance between your office and a golf course on the moon, or the weight of all of the cookies you've eaten in the past year. Within your program, the rules regarding the format of that data are often strict. Effective programs need effective (simple, fast, most compact, most efficient) ways of representing their data.

A string is a piece of textual or binary data with no particular formatting or contents. It could be your name, the contents of an image file, or your program itself. A string has meaning in the program only when you give it meaning.

To represent a literal string in your program, surround it with a pair of quoting characters. The most common string delimiters are single and double quotes:

Characters in a single-quoted string represent themselves literally, with two exceptions. Embed a single quote inside a single-quoted string by escaping the quote with a leading backslash:

You must also escape any backslash at the end of the string to avoid escaping the closing delimiter and producing a syntax error:

Any other backslash will be part of the string as it appears, unless two backslashes are adjacent, in which case the first will escape the second:
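The single-quoting rules above can be checked with a few sample strings. The contents are invented for illustration.

```perl
use strict;
use warnings;

# A plain single-quoted string: every character stands for itself.
my $plain = 'No $interpolation or \n escapes here';

# Escape an embedded single quote with a backslash.
my $reminder = 'Don\'t forget to escape the single quote!';

# Escape a backslash at the end so it does not swallow the closing quote.
my $trailing = 'This string ends with a backslash: \\';

# Elsewhere a lone backslash is literal, and a doubled one collapses to one.
my $lone    = 'a\b';
my $doubled = 'a\\b';

print "$reminder\n$trailing\n";
```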

A double-quoted string has several more special characters available. For example, you may encode otherwise invisible whitespace characters in the string:
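A small sketch of escapes in double-quoted strings. To avoid depending on a literal tab surviving in this text, the comparison builds the tab with chr(9) instead of typing it, which is my substitution.

```perl
use strict;
use warnings;

my $with_escape = "name\tvalue";
my $with_chr    = 'name' . chr(9) . 'value';    # chr(9) is the tab character

# \n inside double quotes is a real newline, so split sees two lines.
my $two_lines = "first\nsecond";
my @lines     = split /\n/, $two_lines;

print "same string\n" if $with_escape eq $with_chr;
```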

This demonstrates a useful principle: the syntax used to declare a string may vary. You can represent a tab within a string with the \t escape or by typing a tab directly. Within Perl's purview, both strings behave the same way, even though the specific representation of the string may differ in the source code.

A string declaration may cross logical newlines; these two declarations are equivalent:

These sequences are often easier to read than their whitespace equivalents.

Perl strings have variable lengths. As you manipulate and modify strings, Perl will change their sizes as appropriate. For example, you can combine multiple strings into a larger string with the concatenation operator . :
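For instance, with placeholder strings:

```perl
use strict;
use warnings;

# Build a string piece by piece with the . operator.
my $kitten = 'Choco' . ' ' . 'Spidermonkey';

print "$kitten\n";
```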

This is effectively the same as if you'd initialized the string all at once.

You may also interpolate the value of a scalar variable or the values of an array within a double-quoted string, such that the current contents of the variable become part of the string as if you'd concatenated them:
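A brief sketch of scalar and array interpolation, using invented variables. Array elements are joined with a space by default.

```perl
use strict;
use warnings;

my $name  = 'Alice';
my @items = qw(tea cake);

# Both $name and @items are replaced by their current contents.
my $greeting = "Hello, $name! You ordered: @items.";

# The equivalent built by concatenation.
my $by_hand = 'Hello, ' . $name . '! You ordered: ' . join( ' ', @items ) . '.';

print "$greeting\n";
```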

Include a literal double-quote inside a double-quoted string by escaping it (that is, preceding it with a leading backslash):

When repeated backslashing becomes unwieldy, use an alternate quoting operator by which you can choose an alternate string delimiter. The q operator indicates single quoting, while the qq operator provides double quoting behavior. The character immediately following the operator determines the characters used to delimit the strings. If the character is the opening character of a balanced pair—such as opening and closing braces—the closing character will be the final delimiter. Otherwise, the character itself will be both the starting and ending delimiter.
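A sketch comparing escaped quotes with the q and qq operators. The sample sentences are made up.

```perl
use strict;
use warnings;

my $speaker = 'he';

# Escaping each double quote by hand.
my $escaped = "\"Ouch,\" $speaker cried.";

# qq gives double-quote behaviour with a delimiter of your choosing.
my $with_qq = qq{"Ouch," $speaker cried.};

# q gives single-quote behaviour; ^ serves as both opening and closing delimiter.
my $with_q = q^Don't escape $speaker here!^;

print "$with_qq\n$with_q\n";
```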

When declaring a complex string with a series of embedded escapes is tedious, use the heredoc syntax to assign one or more lines of a string:

The <<'END_BLURB' syntax has three parts. The double angle-brackets introduce the heredoc. The quotes determine whether the heredoc obeys single- or double-quoted behavior. The default behavior is double-quoted interpolation. END_BLURB is an arbitrary identifier which the Perl 5 parser uses as the ending delimiter.

Be careful; regardless of the indentation of the heredoc declaration itself, the ending delimiter must start at the beginning of the line:

If the identifier begins with whitespace, that same whitespace must be present before the ending delimiter. Yet if you indent the identifier, Perl 5 will not remove equivalent whitespace from the start of each line of the heredoc.
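The heredoc rules can be sketched like this. The END_BLURB and END_INDENTED identifiers and the text are arbitrary choices; note that each ending delimiter sits at the very start of its line.

```perl
use strict;
use warnings;

my $who = 'world';

# Single-quoted identifier: no interpolation.
my $literal = <<'END_BLURB';
Hello, $who!
END_BLURB

# Double-quoted identifier (also the default): interpolation happens.
my $interpolated = <<"END_BLURB";
Hello, $who!
END_BLURB

sub indented_blurb {
    # The body keeps its leading spaces; the delimiter must not be indented.
    my $text = <<'END_INDENTED';
    kept indentation
END_INDENTED
    return $text;
}

print $literal, $interpolated, indented_blurb();
```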

Using a string in a non-string context will induce coercion ( Coercion ).

Unicode and Strings

Unicode is a system for representing the characters of the world's written languages. While most English text uses a character set of only 128 characters (which requires seven bits of storage and fits nicely into eight-bit bytes), it's naïve to believe that you won't someday need an umlaut.

Perl 5 strings can represent either of two separate but related data types:

Sequences of Unicode characters

Each character has a codepoint , a unique number which identifies it in the Unicode character set.

Sequences of octets

Binary data is a sequence of octets —8 bit numbers, each of which can represent a number between 0 and 255.

Words Matter

Why octet and not byte ? Assuming that one character fits in one byte will cause you no end of Unicode grief. Separate the idea of memory storage from character representation.

Unicode strings and binary strings look similar. Each has a length() . Each supports standard string operations such as concatenation, splicing, and regular expression processing. Any string which is not purely binary data is textual data, and should be a sequence of Unicode characters.

However, because of how your operating system represents data on disk or from users or over the network—as sequences of octets—Perl can't know if the data you read is an image file or a text document or anything else. By default, Perl treats all incoming data as sequences of octets. You must add a specific meaning to that data.

Character Encodings

A Unicode string is a sequence of octets which represents a sequence of characters. A Unicode encoding maps octet sequences to characters. Some encodings, such as UTF-8, can encode all of the characters in the Unicode character set. Other encodings represent a subset of Unicode characters. For example, ASCII encodes plain English text with no accented characters, while Latin-1 can represent text in most languages which use the Latin alphabet.

To avoid most Unicode problems, always decode to and from the appropriate encoding at the inputs and outputs of your program.

An Evolving Standard

Perl 5.12 supports the Unicode 5.2 standard, while Perl 5.14 supports Unicode 6.0. If you need to care about the differences between Unicode versions, you probably already know to see http://unicode.org/versions/ .

Unicode in Your Filehandles

When you tell Perl that a specific filehandle ( Files ) works with encoded text, Perl will convert the incoming octets to Unicode strings automatically. To do this, add an IO layer to the mode of the open builtin. An IO layer wraps around input or output and converts the data. In this case, the :utf8 layer decodes UTF-8 data:

You may also modify an existing filehandle with binmode , whether for input or output:
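A self-contained sketch of both techniques, using in-memory filehandles so that no real file is needed. It uses the :encoding(UTF-8) layer, a stricter relative of :utf8 that also validates its input; that choice and the sample data are mine.

```perl
use strict;
use warnings;

# Five octets: the UTF-8 encoding of a four-character name, plus a newline.
my $octets = "Jos\xC3\xA9\n";

# Add the layer to the open mode: octets are decoded as they are read.
open my $in, '<:encoding(UTF-8)', \$octets
    or die "Cannot open in-memory input: $!";
my $line = <$in>;
close $in;
chomp $line;
my $char_count = length $line;

# Or add the layer to an existing handle with binmode.
my $buffer = '';
open my $out, '>', \$buffer
    or die "Cannot open in-memory output: $!";
binmode $out, ':encoding(UTF-8)';
print {$out} $line;
close $out;
my $octet_count = length $buffer;

print "characters read: $char_count, octets written: $octet_count\n";
```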

Without the utf8 mode, printing Unicode strings to a filehandle will result in a warning ( Wide character in %s ), because files contain octets, not Unicode characters.

Unicode in Your Data

The core module Encode provides a function named decode() to convert a scalar containing data to a Unicode string. The corresponding encode() function converts from Perl's internal encoding to the desired output encoding:
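A minimal round trip through decode() and encode(), assuming the input octets are UTF-8 and the desired output is Latin-1. Both encodings are choices made for this example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Octets as they might arrive from a file or socket (UTF-8 encoded).
my $from_outside = "Jos\xC3\xA9";

# Octets in, character string out.
my $string = decode( 'UTF-8', $from_outside );

# Character string in, octets out, in the encoding the output needs.
my $latin1 = encode( 'ISO-8859-1', $string );

printf "%d octets in, %d characters, %d octets out\n",
    length($from_outside), length($string), length($latin1);
```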

Unicode in Your Programs

You may include Unicode characters in your programs in three ways. The easiest is to use the utf8 pragma ( Pragmas ), which tells the Perl parser to interpret the rest of the source code file with the UTF-8 encoding. This allows you to use Unicode characters in strings and identifiers:

To write this code, your text editor must understand UTF-8 and you must save the file with the appropriate encoding.

Within double-quoted strings, you may use the Unicode escape sequence to represent character encodings. The syntax \x{} represents a single character; place the hex form of the character's Unicode number within the curly brackets:

Some Unicode characters have names, and these names are often clearer to read than Unicode numbers. Use the charnames pragma to enable them and the \N{} escape to refer to them:
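Both escape forms can name the same character. The character chosen below is an arbitrary example.

```perl
use strict;
use warnings;
use charnames ':full';

# By number: the hex form of the codepoint.
my $by_number = "\x{00E9}";

# By name: often easier to read.
my $by_name = "\N{LATIN SMALL LETTER E WITH ACUTE}";

print "same character\n" if $by_number eq $by_name;
```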

You may use the \x{} and \N{} forms within regular expressions as well as anywhere else you may legitimately use a string or a character.

Implicit Conversion

Most Unicode problems in Perl arise from the fact that a string could be either a sequence of octets or a sequence of characters. Perl allows you to combine these types through the use of implicit conversions. When these conversions are wrong, they're rarely obviously wrong.

When Perl concatenates a sequence of octets with a sequence of Unicode characters, it implicitly decodes the octet sequence using the Latin-1 encoding. The resulting string will contain Unicode characters. When you print Unicode characters, Perl will encode the string using UTF-8, because Latin-1 cannot represent the entire set of Unicode characters; it covers only a small subset of what UTF-8 can encode.

This asymmetry can lead to Unicode strings encoded as UTF-8 for output and decoded as Latin-1 when input.

Worse yet, when the text contains only English characters with no accents, the bug hides—because both encodings have the same representation for every character.

If $name contains an English name such as Alice you will never notice any problem, because the Latin-1 representation is the same as the UTF-8 representation. If $name contains a name such as José , $name can contain several possible values:

  • $name contains four Unicode characters.
  • $name contains four Latin-1 octets representing four Unicode characters.
  • $name contains five UTF-8 octets representing four Unicode characters.

The string literal has several possible scenarios:

  • It is an ASCII string literal and contains octets: my $hello = "Hello, ";

  • It is a non-ASCII string literal with the utf8 or encoding pragma in effect and contains Unicode characters: use utf8; my $hello = "Kuirabá, ";

If both $hello and $name are Unicode strings, the concatenation will produce another Unicode string.

If both strings are octet streams, Perl will concatenate them into a new octet string. If both values are octets of the same encoding—both Latin-1, for example, the concatenation will work correctly. If the octets do not share an encoding, for example a concatenation appending UTF-8 data to Latin-1 data, then the resulting sequence of octets makes sense in neither encoding. This could happen if the user entered a name as UTF-8 data and the greeting were a Latin-1 string literal, but the program decoded neither.

If only one of the values is a Unicode string, Perl will decode the other as Latin-1 data. If this is not the correct encoding, the resulting Unicode characters will be wrong. For example, if the user input were UTF-8 data and the string literal were a Unicode string, the name would be incorrectly decoded into five Unicode characters to form JosÃ© ( sic ) instead of José because the UTF-8 data means something else when decoded as Latin-1 data.
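The wrong decoding can be reproduced in a few lines. This is a sketch with hand-written octets standing in for user input; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Five octets: the UTF-8 encoding of a four-character name, never decoded.
my $utf8_octets = "Jos\xC3\xA9";

my $right = decode( 'UTF-8',      $utf8_octets );    # four characters
my $wrong = decode( 'ISO-8859-1', $utf8_octets );    # five characters

# A string holding a character above 255 must be a character string,
# so concatenation implicitly treats the octets as Latin-1.
my $hello    = "\x{263A} ";
my $greeting = $hello . $utf8_octets;

printf "right=%d wrong=%d greeting=%d\n",
    length($right), length($wrong), length($greeting);
```

The tail of $greeting matches the wrongly decoded five-character form, not the intended four-character name.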

See perldoc perluniintro for a far more detailed explanation of Unicode, encodings, and how to manage incoming and outgoing data in a Unicode world. For far more detail about managing Unicode effectively throughout your programs, see Tom Christiansen's answer to "Why does Modern Perl avoid UTF-8 by default?" http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129 .

Perl 5.12 added a feature, unicode_strings , which enables Unicode semantics for all string operations within its scope. Perl 5.14 improved this feature; if you work with Unicode in Perl, it's worth upgrading to at least Perl 5.14.

Perl supports numbers as both integers and floating-point values. You may represent them with scientific notation as well as in binary, octal, and hexadecimal forms:
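A sketch of each literal form, with arbitrary values:

```perl
use strict;
use warnings;

my $integer   = 42;
my $float     = 0.007;
my $sci_float = 1.02e14;      # scientific notation
my $binary    = 0b101010;     # 0b prefix: binary
my $octal     = 052;          # leading 0: octal
my $hex       = 0x20;         # 0x prefix: hexadecimal

print "$integer $float $sci_float $binary $octal $hex\n";
```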

The leading 0b , 0 , and 0x are the numeric prefixes for binary, octal, and hex notation respectively. Be aware that a leading zero on an integer always indicates octal mode.

When 1.99 + 1.99 is 4

Even though you can write floating-point values explicitly in Perl 5 with perfect accuracy, Perl 5 stores them internally in a binary format. This representation is sometimes imprecise in specific ways; consult perldoc perlnumber for more details.

You may not use commas to separate thousands in numeric literals, lest the parser interpret the commas as comma operators. Instead, use underscores within the number. The parser will treat them as invisible characters; your readers may not. These are equivalent:
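For example, all three of these hypothetical declarations hold the same value:

```perl
use strict;
use warnings;

my $billion       = 1000000000;
my $grouped       = 1_000_000_000;
my $oddly_grouped = 10_0_00_00_0_0_0;    # legal, but hard to read

print "all equal\n" if $billion == $grouped && $grouped == $oddly_grouped;
```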

Consider the most readable alternative.

Because of coercion ( Coercion ), Perl programmers rarely have to worry about converting text read from outside the program to numbers. Perl will treat anything which looks like a number as a number in numeric contexts. In the rare circumstances where you need to know if something looks like a number to Perl, use the looks_like_number function from the core module Scalar::Util . This function returns a true value if Perl will consider the given argument numeric.
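A quick sketch of looks_like_number on a handful of sample strings (the samples are invented):

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

my @candidates = ( '42', '3.14', '1e5', 'forty-two', '' );

# Record 1 or 0 for each candidate.
my %is_numeric = map { $_ => ( looks_like_number($_) ? 1 : 0 ) } @candidates;

for my $candidate (@candidates) {
    printf "'%s' %s\n", $candidate,
        $is_numeric{$candidate} ? 'looks numeric' : 'does not look numeric';
}
```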

The Regexp::Common module from the CPAN provides several well-tested regular expressions to identify more specific valid types (whole number, integer, floating-point value) of numeric values.

Perl 5's undef value represents an unassigned, undefined, and unknown value. Declared but undefined scalar variables contain undef :

undef evaluates to false in boolean context. Evaluating undef in a string context—such as interpolating it into a string—produces an uninitialized value warning:

... produces:

The defined builtin returns a true value if its operand evaluates to a defined value (anything other than undef ):
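The behaviour described in this section can be observed directly. This sketch installs a warning handler so the uninitialized-value warning is captured rather than printed; the handler and variable names are my own.

```perl
use strict;
use warnings;

# Capture warnings so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Declared but never assigned: contains undef.
my $undefined;
my $was_defined = defined $undefined ? 1 : 0;

# Using undef as a string triggers an "uninitialized value" warning.
my $text = $undefined . '... and so forth';

# An empty string is false in boolean context, yet defined.
my $empty         = '';
my $empty_defined = defined $empty ? 1 : 0;

print "captured: $warnings[0]";
```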

The Empty List

When used on the right-hand side of an assignment, the () construct represents an empty list. In scalar context, this evaluates to undef . In list context, it is an empty list. When used on the left-hand side of an assignment, the () construct imposes list context. To count the number of elements returned from an expression in list context without using a temporary variable, use the idiom ( Idioms ):

Because of the right associativity ( Associativity ) of the assignment operator, Perl first evaluates the second assignment by calling get_all_clown_hats() in list context. This produces a list.

Assignment to the empty list throws away all of the values of the list, but that assignment takes place in scalar context, which evaluates to the number of items on the right hand side of the assignment. As a result, $count contains the number of elements in the list returned from get_all_clown_hats() .
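The idiom looks like this. The body of get_all_clown_hats() is a stand-in invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in that returns a list of three items.
sub get_all_clown_hats { return ( 'red', 'polka-dot', 'striped' ) }

# Right to left: call in list context, assign to the empty list,
# then evaluate that assignment in scalar context to get the count.
my $count = () = get_all_clown_hats();

print "There are $count clown hats\n";
```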

If you find that concept confusing right now, fear not. As you understand how Perl's fundamental design features fit together in practice, it will make more sense.

A list is a comma-separated group of one or more expressions. Lists may occur verbatim in source code as values:

... as targets of assignments:

... or as lists of expressions:

Parentheses do not create lists. The comma operator creates lists. Where present, the parentheses in these examples group expressions to change their precedence ( Precedence ).

Use the range operator to create lists of literals in a compact form:

Use the qw() operator to split a literal string on whitespace to produce a list of strings:
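A compact sketch of the list forms discussed in this section: literal lists, assignment targets, ranges, and qw(). All values are illustrative.

```perl
use strict;
use warnings;

# A list appearing verbatim as a value.
my @first_fibs = ( 1, 1, 2, 3, 5, 8, 13, 21 );

# A list as the target of an assignment.
my ( $width, $height ) = ( 640, 480 );

# The range operator builds lists of numbers or letters.
my @count = 13 .. 27;
my @chars = 'a' .. 'z';

# qw() splits a literal string on whitespace.
my @stooges = qw( Larry Curly Moe Shemp Joey Kenny );

print scalar(@count), ' numbers, ', scalar(@chars), ' letters, ',
    scalar(@stooges), " stooges\n";
```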

No Comment Please

Perl will emit a warning if a qw() contains a comma or the comment character ( # ), because not only are such characters rare in a qw() , their presence usually indicates an oversight.

Lists can (and often do) occur as the results of expressions, but these lists do not appear literally in source code.

Lists and arrays are not interchangeable in Perl. Lists are values. Arrays are containers. You may store a list in an array and you may coerce an array to a list, but they are separate entities. For example, indexing into a list always occurs in list context. Indexing into an array can occur in scalar context (for a single element) or list context (for a slice):

Control Flow

Perl's basic control flow is straightforward. Program execution starts at the beginning (the first line of the file executed) and continues to the end:

Perl's control flow directives change the order of execution—what happens next in the program—depending on the values of their expressions.

Branching Directives

The if directive performs the associated action only when its conditional expression evaluates to a true value:

This postfix form is useful for simple expressions. A block form groups multiple expressions into a single unit:

While the block form requires parentheses around its condition, the postfix form does not.

The conditional expression may consist of multiple subexpressions, as long as it evaluates to a single top-level expression:

In the postfix form, adding parentheses can clarify the intent of the code at the expense of visual cleanliness:

The unless directive is a negated form of if . Perl will perform the action when the conditional expression evaluates to false :

Like if , unless also has a block form, though many programmers avoid it, as it rapidly becomes difficult to read with complex conditionals:

unless works very well for postfix conditionals, especially parameter validation in functions ( Postfix Parameter Validation ):

The block forms of if and unless both work with the else directive, which provides code to run when the conditional expression does not evaluate to true (for if ) or false (for unless ):

else blocks allow you to rewrite if and unless conditionals in terms of each other:

However, the implied double negative of using unless with an else block can be confusing. This example may be the only place you ever see it.

Just as Perl provides both if and unless to allow you to phrase your conditionals in the most readable way, you can choose between positive and negative conditional operators:

... though the double negative implied by the presence of the else block suggests inverting the conditional.

One or more elsif directives may follow an if block form and may precede any single else :

An unless chain may also use an elsif block (good luck deciphering that!). There is no elseunless.

Writing else if is a syntax error (Larry prefers elsif for aesthetic reasons, as well as the prior art of the Ada programming language):

The Ternary Conditional Operator

The ternary conditional operator evaluates a conditional expression and produces one of two alternatives:

The conditional expression precedes the question mark character ( ? ) and the colon character ( : ) separates the alternatives. The alternatives are expressions of arbitrary complexity—including other ternary conditional expressions.
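
For illustration, a small sketch with one plain ternary and one nested ternary; $hour and $size are hypothetical inputs chosen for this example.

```perl
use strict;
use warnings;

my $hour        = 15;    # hypothetical input
my $time_suffix = $hour >= 12 ? 'afternoon' : 'morning';

# An alternative may itself be another ternary expression.
my $size  = 7;           # hypothetical input
my $label = $size < 5  ? 'small'
          : $size < 10 ? 'medium'
          :              'large';

print "$time_suffix $label\n";    # prints afternoon medium
```

Nesting more than two levels deep tends to cost more in readability than it saves in lines.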

An interesting, though obscure, idiom is to use the ternary conditional to select between alternative variables , not only values:

Again, weigh the benefits of clarity versus the benefits of conciseness.

Short Circuiting

Perl exhibits short-circuiting behavior when it encounters complex conditional expressions. When Perl can determine that a complex expression would succeed or fail as a whole without evaluating every subexpression, it will not evaluate subsequent subexpressions. This is most obvious with an example:

The return value of ok() ( Testing ) is the boolean value obtained by evaluating the first argument, so this code prints:

When the first subexpression—the first call to ok —evaluates to a true value, Perl must evaluate the second subexpression. If the first subexpression had evaluated to a false value, there would be no need to check subsequent subexpressions, as the entire expression could not succeed:

This example prints:

Even though the second subexpression would obviously succeed, Perl never evaluates it. The same short-circuiting behavior is evident for logical-or operations:

With the success of the first subexpression, Perl can avoid evaluating the second subexpression. If the first subexpression were false, the result of evaluating the second subexpression would dictate the result of evaluating the entire expression.
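
The behavior can be observed with a sketch like the following. It uses a hypothetical helper, check(), which records each call so that skipped subexpressions become visible; it stands in for the ok() calls the text describes.

```perl
use strict;
use warnings;

my @called;

# Hypothetical helper: records its name, then returns the given value.
sub check {
    my ($name, $value) = @_;
    push @called, $name;
    return $value;
}

# Logical-and: a false first operand means the second is never evaluated.
my $and_result = check('first', 0) && check('second', 1);

# Logical-or: a true first operand means the second is never evaluated.
my $or_result = check('third', 1) || check('fourth', 1);

print "@called\n";    # prints first third
```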

Besides allowing you to avoid potentially expensive computations, short circuiting can help you to avoid errors and warnings, as in the case where using an undefined value might raise a warning:

Context for Conditional Directives

The conditional directives— if , unless , and the ternary conditional operator—all evaluate an expression in boolean context ( Context ). As comparison operators such as eq , == , ne , and != all produce boolean results when evaluated, Perl coerces the results of other expressions—including variables and values—into boolean forms.

Perl 5 has no single true value, nor a single false value. Any number which evaluates to 0 is false. This includes 0 , 0.0 , 0e0 , 0x0 , and so on. The empty string ( '' ) and '0' evaluate to a false value, but the strings '0.0' , '0e0' , and so on do not. The idiom '0 but true' evaluates to 0 in numeric context but true in boolean context, thanks to its string contents.

Both the empty list and undef evaluate to a false value. Empty arrays and hashes return the number 0 in scalar context, so they evaluate to a false value in boolean context. An array which contains a single element—even undef —evaluates to true in boolean context. A hash which contains any elements—even a key and a value of undef —evaluates to a true value in boolean context.
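
These rules can be checked with a short sketch; the particular sample values are assumptions chosen to cover the cases described above.

```perl
use strict;
use warnings;

my @false_values = (0, 0.0, '', '0', undef);
my @true_values  = ('0.0', '0e0', '0 but true', ' ', 'a');

# grep in scalar context counts the elements that pass the test.
my $false_count = grep { !$_ } @false_values;
my $true_count  = grep { $_ } @true_values;

my @empty;
my @one_undef = (undef);
my $empty_is_false = @empty     ? 0 : 1;
my $undef_is_true  = @one_undef ? 1 : 0;

print "$false_count $true_count $empty_is_false $undef_is_true\n";
# prints 5 5 1 1
```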

Greater Control Over Context

The Want module from the CPAN allows you to detect boolean context within your own functions. The core overloading pragma ( Overloading ) allows you to specify what your own data types produce when evaluated in various contexts.

Looping Directives

Perl provides several directives for looping and iteration. The foreach -style loop evaluates an expression which produces a list and executes a statement or block until it has consumed that list:

This example uses the range operator to produce a list of integers from one to ten inclusive. The foreach directive loops over them, setting the topic variable $_ ( The Default Scalar Variable ) to each in turn. Perl executes the block for each integer and prints the squares of the integers.

foreach versus for

Many Perl programmers refer to iteration as foreach loops, but Perl treats the names foreach and for interchangeably. The subsequent code determines the type and behavior of the loop.

Like if and unless , this loop has a postfix form:

A for loop may use a named variable instead of the topic:

When a for loop uses an iterator variable, the variable scope is within the loop. Perl will set this lexical to the value of each item in the iteration. Perl will not modify the topic variable ( $_ ). If you have declared a lexical $i in an outer scope, its value will persist outside the loop:

This localization occurs even if you do not redeclare the iteration variable as a lexical (but do declare your iteration variables as lexicals to reduce their scope):

Iteration and Aliasing

The for loop aliases the iterator variable to the values in the iteration such that any modifications to the value of the iterator modifies the iterated value in place:

This aliasing also works with the block style for loop:

... as well as iteration with the topic variable:

You cannot use aliasing to modify constant values, however:

Instead Perl will produce an exception about modification of read-only values.
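
A compact sketch of the aliasing rules in this section, using made-up values: the first two loops modify the array in place, and the eval block traps the exception raised when the aliased values are literal constants.

```perl
use strict;
use warnings;

my @nums = 1 .. 5;

$_ **= 2 for @nums;                   # postfix form, aliasing through $_
for my $num (@nums) { $num += 1 }     # block form, named iterator

print "@nums\n";    # prints 2 5 10 17 26

# Literal constants are read-only, so modifying them through the alias dies.
my $ok = eval { $_++ for 1, 2; 1 };
print "caught: $@" unless $ok;
```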

You may occasionally see the use of for with a single scalar variable to alias $_ to the variable:

Iteration and Scoping

Iterator scoping with the topic variable provides one common source of confusion. Consider a function topic_mangler() which modifies $_ on purpose. If code iterating over a list called topic_mangler() without protecting $_ , debugging fun would ensue:

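If you must use $_ rather than a named variable, make the topic variable lexical with my $_ (available from Perl 5.10, but removed again in Perl 5.24, so this works only on those older releases):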

Using a named iteration variable also prevents undesired aliasing behavior through $_ .

The C-Style For Loop

The C-style for loop requires you to manage the conditions of iteration:

You must explicitly assign to an iteration variable in the looping construct, as this loop performs neither aliasing nor assignment to the topic variable. While any variable declared in the loop construct is scoped to the lexical block of the loop, there is no lexicalization of a variable declared outside of the loop construct:

The looping construct may have three subexpressions. The first subexpression—the initialization section—executes only once, before the loop body executes. Perl evaluates the second subexpression—the conditional comparison—before each iteration of the loop body. When this evaluates to a true value, iteration proceeds. When it evaluates to a false value, iteration stops. The final subexpression executes after each iteration of the loop body.
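
A minimal sketch of the three subexpressions at work; the bounds and step are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my @squares;

# initialization; condition checked before each pass; step run after each pass
for (my $i = 0; $i <= 10; $i += 2) {
    push @squares, $i * $i;
}

print "@squares\n";    # prints 0 4 16 36 64 100
```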

Note the lack of a semicolon after the final subexpression as well as the use of the comma operator and the low-precedence and operator; this syntax is surprisingly finicky. When possible, prefer the foreach-style loop to the for loop.

All three subexpressions are optional. An infinite for loop might be:

While and Until

A while loop continues until the loop conditional expression evaluates to a boolean false value. An idiomatic infinite loop is:

Unlike the iteration foreach -style loop, the while loop's condition has no side effects by itself. That is, if @values has one or more elements, this code is also an infinite loop:

To prevent such an infinite while loop, use a destructive update of the @values array by modifying the array with each loop iteration:

Modifying @values inside of the while condition check also works, but it has some subtleties related to the truthiness of each value.

This loop will exit as soon as it reaches an element that evaluates to a false value, not necessarily when it has exhausted the array. That may be the desired behavior, but is often surprising to novices.
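
The difference between the two approaches shows up as soon as the array holds a false value. In this sketch the arrays and their contents are sample data, with a 0 placed in the middle on purpose.

```perl
use strict;
use warnings;

# Destructive update in the body: the condition tests the array's size.
my @values = (3, 2, 0, 1);
my @seen;
while (@values) {
    my $value = shift @values;
    push @seen, $value;
}

# Destructive update in the condition: stops at the first false element.
my @more = (3, 2, 0, 1);
my @taken;
while (my $value = shift @more) {
    push @taken, $value;
}

print "@seen | @taken\n";    # prints 3 2 0 1 | 3 2
```

After the second loop, @more still holds the final element, because the 0 ended iteration early.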

The until loop reverses the sense of the test of the while loop. Iteration continues while the loop conditional expression evaluates to a false value:

The canonical use of the while loop is to iterate over input from a filehandle:

Perl 5 interprets this while loop as if you had written:

Without the implicit defined , any line read from the filehandle which evaluated to a false value in a scalar context—a blank line or a line which contained only the character 0 —would end the loop. The readline ( <> ) operator returns an undefined value only when it has reached the end of the file.
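
As a runnable sketch of the explicit form, the following reads from an in-memory filehandle (an assumption made so that the example needs no file on disk). The final line of the sample data is a lone 0 with no trailing newline, which is false as a string but still defined.

```perl
use strict;
use warnings;

my $text = "first\nsecond\n0";    # sample data; last line is a bare 0

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @lines;
while (defined(my $line = <$fh>)) {
    chomp $line;
    push @lines, $line;
}
close $fh;

print scalar(@lines), "\n";    # prints 3
```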

chomp Your Lines

Use the chomp builtin to remove line-ending characters from each line. Many novices forget this.

Both while and until have postfix forms, such as the infinite loop 1 while 1; . Any single expression is suitable for a postfix while or until , including the classic "Hello, world!" example from 8-bit computers of the early 1980s:

Infinite loops are more useful than they seem, especially for event loops in GUI programs, program interpreters, or network servers:

Use a do block to group several expressions into a single unit:

A do block parses as a single expression which may contain several expressions. Unlike the while loop's block form, the do block with a postfix while or until will execute its body at least once. This construct is less common than the other loop forms, but no less powerful.

Loops within Loops

You may nest loops within other loops:

When you do so, declare named iteration variables! The potential for confusion with the topic variable and its scope is too great otherwise.

A common mistake with nesting foreach and while loops is that it is easy to exhaust a filehandle with a while loop:

Opening the filehandle outside of the for loop leaves the file position unchanged between each iteration of the for loop. On its second iteration, the while loop will have nothing to read and will not execute. To solve this problem, re-open the file inside the for loop (simple to understand, but not always a good use of system resources), slurp the entire file into memory (which may not work if the file is large), or seek the filehandle back to the beginning of the file for each iteration (an often overlooked option):

Loop Control

Sometimes you need to break out of a loop before you have exhausted the iteration conditions. Perl 5's standard control mechanisms—exceptions and return —work, but you may also use loop control statements.

The next statement restarts the loop at its next iteration. Use it when you've done all you need to in the current iteration. To loop over lines in a file but skip everything that starts with the comment character # , write:

Multiple Exits versus Nested Ifs

Compare the use of next with the alternative: wrapping the rest of the body of the block in an if . Now consider what happens if you have multiple conditions which could cause you to skip a line. Loop control modifiers with postfix conditionals can make your code much more readable.

The last statement ends the loop immediately. To finish processing a file once you've seen the ending token, write:

The redo statement restarts the current iteration without evaluating the conditional again. This can be useful in those few cases where you want to modify the line you've read in place, then start processing over from the beginning without clobbering it with another line. To implement a silly file parser that joins lines which end with a backslash:

Using loop control statements in nested loops can be confusing. If you cannot avoid nested loops—by extracting inner loops into named functions—use a loop label to clarify:

The continue construct behaves like the third subexpression of a for loop; Perl executes its block before subsequent iterations of a loop, whether due to normal loop repetition or premature re-iteration from next (the Perl equivalent to C's continue is next). You may use it with a while, until, when, or for loop. Examples of continue are rare, but it's useful any time you want to guarantee that something occurs with every iteration of the loop regardless of how that iteration ends:

Be aware that a continue block does not execute when control flow leaves a loop due to last or redo .

The given construct is a feature new to Perl 5.10. It assigns the value of an expression to the topic variable and introduces a block:

Unlike for , it does not iterate over an aggregate. It evaluates its expression in scalar context, and always assigns to the topic variable:

given also lexicalizes the topic variable:

given is most useful when combined with when ( Smart Matching ). given topicalizes a value within a block so that multiple when statements can match the topic against expressions using smart-match semantics. To write the Rock, Paper, Scissors game:

Perl executes the default rule when none of the other conditions match.

Simplified Dispatch with Multimethods

The CPAN module MooseX::MultiMethods provides another technique to simplify this code.

A tailcall occurs when the last expression within a function is a call to another function—the outer function's return value is the inner function's return value:

Returning from greet_person() directly to the caller of log_and_greet_person() is more efficient than returning to log_and_greet_person() and then immediately returning from log_and_greet_person(). Returning directly from greet_person() to the caller of log_and_greet_person() is a tailcall optimization.

Heavily recursive code ( Recursion ), especially mutually recursive code, can consume a lot of memory. Tailcalls reduce the memory needed for internal bookkeeping of control flow and can make expensive algorithms tractable. Unfortunately, Perl 5 does not automatically perform this optimization; you have to do it yourself when it's necessary.

The builtin goto operator has a form which calls a function as if the current function were never called, essentially erasing the bookkeeping for the new function call. The ugly syntax confuses people who've heard "Never use goto ", but it works:

This example has two important features. First, goto &function_name or goto &$function_reference requires the use of the function sigil ( & ) so that the parser knows to perform a tailcall instead of jumping to a label. Second, this form of function call passes the contents of @_ implicitly to the called function. You may modify @_ to change the passed arguments.
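
A sketch of the goto form, reusing the function names from the text; the function bodies and the @messages array are assumptions made for illustration.

```perl
use strict;
use warnings;

my @messages;

sub greet_person {
    my ($name) = @_;
    return "Hello, $name!";
}

sub log_and_greet_person {
    push @messages, "Greeting $_[0]";

    # Replace the current call with greet_person(); @_ passes through as-is.
    goto &greet_person;
}

my $greeting = log_and_greet_person('Alice');
print "$greeting\n";    # prints Hello, Alice!
```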

This technique is relatively rare; it's most useful when you want to hijack control flow to get out of the way of other functions inspecting caller (such as when you're implementing special logging or some sort of debugging feature), or when using an algorithm which requires a lot of recursion.

Perl 5's fundamental data type is the scalar , a single, discrete value. That value may be a string, an integer, a floating point value, a filehandle, or a reference—but it is always a single value. Scalars may be lexical, package, or global ( Global Variables ) variables. You may only declare lexical or package variables. The names of scalar variables must conform to standard variable naming guidelines ( Names ). Scalar variables always use the leading dollar-sign ( $ ) sigil ( Variable Sigils ).

Variant Sigils and Context

Scalar values and scalar context have a deep connection; assigning to a scalar provides scalar context. Using the scalar sigil with an aggregate variable imposes scalar context to access a single element of the hash or array.

Scalars and Types

A scalar variable can contain any type of scalar value without special conversions or casts, and the type of value stored in a variable can change:

Even though this code is legal , changing the type of data stored in a scalar is a sign of confusion.

This flexibility of type often leads to value coercion ( Coercion ). For example, you may treat the contents of a scalar as a string, even if you didn't explicitly assign it a string:

You may also use mathematical operations on strings:

One-Way Increment Magic

This magical string increment behavior has no corresponding magical decrement behavior. You can't get the previous string value by writing $call_sign-- .

This string increment operation turns a into b and z into aa , respecting character set and case. While ZZ9 becomes AAA0 , ZZ09 becomes ZZ10 —numbers wrap around while there are more significant places to increment, as on a vehicle odometer.
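
The cases described above can be reproduced with a sketch like this one; the list of starting strings is taken from the text, and each is copied into a fresh variable before being incremented.

```perl
use strict;
use warnings;

my @results;
for my $start (qw( a z ZZ9 ZZ09 )) {
    my $value = $start;    # work on a copy of the literal
    $value++;              # magical string increment
    push @results, $value;
}

print "@results\n";    # prints b aa AAA0 ZZ10
```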

Evaluating a reference ( References ) in string context produces a string. Evaluating a reference in numeric context produces a number. Neither operation modifies the reference in place, but you cannot recreate the reference from either result:

$authors is still useful as a reference, but $stringy_ref is a string with no connection to the reference and $numeric_ref is a number with no connection to the reference.

To allow coercion without data loss, Perl 5 scalars can contain both numeric and string components. The internal data structure which represents a scalar in Perl 5 has a numeric slot and a string slot. Accessing a string in a numeric context produces a scalar with both string and numeric values. The dualvar() function within the core Scalar::Util module allows you to manipulate both values directly within a single scalar.

Scalars do not contain a separate slot for boolean values. In boolean context, the empty string ( '' ) and '0' are false. All other strings are true. In boolean context, numbers which evaluate to zero ( 0 , 0.0 , and 0e0 ) are false. All other numbers are true.

What is Truth?

Be careful that the strings '0.0' and '0e0' are true; this is one place where Perl 5 makes a distinction between what looks like a number and what really is a number.

One other value is always false: undef . This is the value of uninitialized variables as well as a value in its own right.

Perl 5 arrays are first-class data structures—the language supports them as a built-in data type—which store zero or more scalars. You can access individual members of the array by integer indexes, and you can add or remove elements at will. The @ sigil denotes an array. To declare an array:

Array Elements

Accessing an individual element of an array in Perl 5 requires the scalar sigil. $cats[0] is an unambiguous use of the @cats array, because postfix ( Fixity ) square brackets ( [] ) always mean indexed access to an array.

The first element of an array is at index zero:

The last index of an array depends on the number of elements in the array. An array in scalar context (due to scalar assignment, string concatenation, addition, or boolean context) evaluates to the number of elements in the array:

To get the index of the final element of an array, subtract one from the number of elements of the array (remember that array indexes start at 0) or use the unwieldy $#cats syntax:

When the index matters less than the position of an element, use negative array indices instead. The last element of an array is available at the index -1 . The second to last element of the array is available at index -2 , and so on:

$# has another use: resize an array in place by assigning to it. Remember that Perl 5 arrays are mutable. They expand or contract as necessary. When you shrink an array, Perl will discard values which do not fit in the resized array. When you expand an array, Perl will fill the expanded positions with undef .
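
The counting, indexing, and resizing rules above, gathered into one sketch; the cat names are sample data.

```perl
use strict;
use warnings;

my @cats = qw( Daisy Petunia Tuxedo Choco );

my $count          = @cats;        # scalar context: 4 elements
my $last_index     = $#cats;       # 3, the same as $count - 1
my $last_cat       = $cats[-1];    # 'Choco'
my $second_to_last = $cats[-2];    # 'Tuxedo'

$#cats = 1;    # shrink: only Daisy and Petunia remain
$#cats = 3;    # expand: positions 2 and 3 now hold undef

print "$count $last_index $last_cat $second_to_last\n";
# prints 4 3 Choco Tuxedo
```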

Array Assignment

Assign to individual positions in an array directly by index:

If you assign to an index beyond the array's current bound, Perl will extend the array to account for the new size and will fill in all intermediary positions with undef . After the first assignment, the array will contain undef at positions 0, 1, and 2 and Jack at position 3.

As an assignment shortcut, initialize an array from a list:

... but remember that these parentheses do not create a list. Without parentheses, this would assign Daisy as the first and only element of the array, due to operator precedence ( Precedence ).

Any expression which produces a list in list context can assign to an array:

Assigning to a scalar element of an array imposes scalar context, while assigning to the array as a whole imposes list context.

To clear an array, assign an empty list:

Arrays Start Empty

my @items = (); is a longer and noisier version of my @items because freshly-declared arrays start out empty.

Array Operations

Sometimes an array is more convenient as an ordered, mutable collection of items than as a mapping of indices to values. Perl 5 provides several operations to manipulate array elements without using indices.

The push and pop operators add and remove elements from the tail of an array, respectively:

You may push a list of values onto an array, but you may only pop one at a time. push returns the new number of elements in the array. pop returns the removed element.

Because push operates on a list, you can easily append the elements of one or more arrays to another with:

Similarly, unshift and shift add elements to and remove an element from the start of an array, respectively:

unshift prepends a list of elements to the start of the array and returns the new number of elements in the array. shift removes and returns the first element of the array.
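
A brief sketch of all four operators on one array; the meal names are made-up sample data.

```perl
use strict;
use warnings;

my @meals = qw( hamburgers pizza );

my $new_size = push @meals, qw( lasagna turnip );    # returns 4
my $unwanted = pop @meals;                           # removes 'turnip'

unshift @meals, qw( tofu curry );                    # prepends both
my $first = shift @meals;                            # removes 'tofu'

print "@meals\n";    # prints curry hamburgers pizza lasagna
```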

Few programs use the return values of push and unshift .

The splice operator removes and replaces elements from an array given an offset, a length of a list slice, and replacements. Both replacing and removing are optional; you may omit either behavior. The perlfunc description of splice demonstrates its equivalences with push , pop , shift , and unshift . One effective use is removal of two elements from an array:
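
For instance, a sketch that removes the first two elements; @finalists and its contents are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

my @finalists = qw( Ann Bob Cid Dee );

# Offset 0, length 2, no replacement list: remove and return two elements.
my ($winner, $runnerup) = splice @finalists, 0, 2;

print "$winner $runnerup | @finalists\n";    # prints Ann Bob | Cid Dee
```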

Prior to Perl 5.12, iterating over an array by index required a C-style loop. As of Perl 5.12, each can iterate over an array by index and value:

Array Slices

The array slice construct allows you to access elements of an array in list context. Unlike scalar access of an array element, this indexing operation takes a list of zero or more indices and uses the array sigil ( @ ):

Array slices are useful for assignment:

A slice can contain zero or more elements—including one:

The only syntactic difference between an array slice of one element and the scalar access of an array element is the leading sigil. The semantic difference is greater: an array slice always imposes list context. An array slice evaluated in scalar context will produce a warning:

An array slice imposes list context on the expression used as its index:

Arrays and Context

In list context, arrays flatten into lists. If you pass multiple arrays to a normal Perl 5 function, they will flatten into a single list:

Within the function, @_ will contain seven elements, not two, because list assignment to arrays is greedy . An array will consume as many elements from the list as possible. After the assignment, @cats will contain every argument passed to the function. @dogs will be empty.
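
The greedy assignment can be seen in a sketch such as this; the function name take_pets_for_walk and the pet names are assumptions, chosen so that seven arguments are passed in total.

```perl
use strict;
use warnings;

sub take_pets_for_walk {
    # @cats greedily consumes every argument; nothing remains for @dogs.
    my (@cats, @dogs) = @_;
    return (scalar @cats, scalar @dogs);
}

my @cats = qw( Daisy Petunia Tuxedo Brad Jack );
my @dogs = qw( Rodney Lucky );

my ($cat_count, $dog_count) = take_pets_for_walk(@cats, @dogs);
print "$cat_count $dog_count\n";    # prints 7 0
```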

This flattening behavior sometimes confuses novices who attempt to create nested arrays in Perl 5:

... but this code is effectively the same as:

... because these parentheses merely group expressions. They do not create lists in these circumstances. To avoid this flattening behavior, use array references ( Array References ).

Array Interpolation

Arrays interpolate in strings as lists of the stringifications of each item separated by the current value of the magic global $" . The default value of this variable is a single space. Its English.pm mnemonic is $LIST_SEPARATOR . Thus:

Localize $" with a delimiter to ease your debugging (credit goes to Mark Jason Dominus for this technique):
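
A sketch of both the default separator and a localized one; the array contents and the ')(' delimiter are illustrative choices.

```perl
use strict;
use warnings;

my @alphabet = 'a' .. 'e';

my $default = "[@alphabet]";    # joined with the default single space

my $bracketed;
{
    # The change is confined to this block.
    local $" = ')(';
    $bracketed = "(@alphabet)";
}

print "$default\n";      # prints [a b c d e]
print "$bracketed\n";    # prints (a)(b)(c)(d)(e)
```

With each element wrapped in its own delimiters, stray whitespace or empty elements become easy to spot in debugging output.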

A hash is a first-class Perl data structure which associates string keys with scalar values. In the same way that the name of a variable corresponds to a storage location, a key in a hash refers to a value. Think of a hash like you would a telephone book: use the names of your friends to look up their numbers. Other languages call hashes tables , associative arrays , dictionaries , or maps .

Hashes have two important properties: they store one scalar per unique key and they provide no specific ordering of keys.

Declaring Hashes

Hashes use the % sigil. Declare a lexical hash with:

A hash starts out empty. You could write my %favorite_flavors = (); , but that's redundant.

Hashes use the scalar sigil $ when accessing individual elements and curly braces { } for keyed access:

Assign a list of keys and values to a hash in a single expression:

If you assign an odd number of elements to the hash, you will receive a warning to that effect. Idiomatic Perl often uses the fat comma operator ( => ) to associate values with keys, as it makes the pairing more visible:

The fat comma operator acts like the regular comma, but also automatically quotes the previous bareword ( Barewords ). The strict pragma will not warn about such a bareword—and if you have a function with the same name as a hash key, the fat comma will not call the function:

The key of this hash will be name and not Leonardo . To call the function, make the function call explicit:

Assign an empty list to empty a hash (you may occasionally see undef %hash instead):

Hash Indexing

Access individual hash values with an indexing operation. Use a key (a keyed access operation) to retrieve a value from a hash:

In this example, $name contains a string which is also a key of the hash. As with accessing an individual element of an array, the hash's sigil has changed from % to $ to indicate keyed access to a scalar value.

You may also use string literals as hash keys. Perl quotes barewords automatically according to the same rules as fat commas:

Don't Quote Me

Novices often quote every string literal hash key, but experienced developers elide the quotes whenever possible. In this way, the presence of quotes in hash keys signifies an intention to do something different.

Even Perl 5 builtins get the autoquoting treatment:

The unary plus ( Unary Coercions ) turns what would be a bareword ( shift ) subject to autoquoting rules into an expression. As this implies, you can use an arbitrary expression—not only a function call—as the key of a hash:

Hash keys can only be strings. Anything that evaluates to a string is an acceptable hash key. Perl will go so far as to coerce ( Coercion ) any non-string into a string, such that if you use an object as a hash key, you'll get the stringified version of that object instead of the object itself:

Hash Key Existence

The exists operator returns a boolean value to indicate whether a hash contains the given key:

Using exists instead of accessing the hash key directly avoids two problems. First, it does not check the boolean nature of the hash value ; a hash key may exist with a value even if that value evaluates to a boolean false (including undef ):

Second, exists avoids autovivification ( Autovivification ) within nested data structures ( Nested Data Structures ).

If a hash key exists, its value may be undef . Check that with defined :
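
The three checks (truth, existence, definedness) give different answers for the same hash, as this sketch with made-up keys shows.

```perl
use strict;
use warnings;

my %settings = ( verbose => 0, name => undef );

my $verbose_exists = exists $settings{verbose}  ? 1 : 0;    # 1: key is present
my $verbose_true   = $settings{verbose}         ? 1 : 0;    # 0: value is false
my $name_exists    = exists $settings{name}     ? 1 : 0;    # 1: key is present
my $name_defined   = defined $settings{name}    ? 1 : 0;    # 0: value is undef
my $missing_exists = exists $settings{missing}  ? 1 : 0;    # 0: no such key

print "$verbose_exists $verbose_true $name_exists $name_defined $missing_exists\n";
# prints 1 0 1 0 0
```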

Accessing Hash Keys and Values

Hashes are aggregate variables, but their pairwise nature offers many more possibilities for iteration: over the keys of a hash, the values of a hash, or pairs of keys and values. The keys operator produces a list of hash keys:

The values operator produces a list of hash values:

The each operator produces a list of two-element lists of the key and the value:

Unlike arrays, there is no obvious ordering to these lists. The ordering depends on the internal implementation of the hash, the particular version of Perl you are using, the size of the hash, and a random factor. Even so, the order of hash items is consistent between keys , values , and each . Modifying the hash may change the order, but you can rely on that order if the hash remains the same.

Each hash has only a single iterator for the each operator. You cannot reliably iterate over a hash with each more than once; if you begin a new iteration while another is in progress, the former will end prematurely and the latter will begin partway through the hash. During such iteration, take care not to call any function which may itself try to iterate over the hash with each.

In practice this occurs rarely, but reset a hash's iterator with keys or values in void context when you need it:

Hash Slices

A hash slice is a list of keys or values of a hash indexed in a single operation. To initialize multiple elements of a hash at once:

This is equivalent to the initialization:

... except that the hash slice initialization does not replace the existing contents of the hash.

Hash slices also allow you to retrieve multiple values from a hash in a single operation. As with array slices, the sigil of the hash changes to indicate list context. The use of the curly braces indicates keyed access and makes the hash unambiguous:

Hash slices make it easy to merge two hashes:

This is equivalent to looping over the contents of %canada_addresses manually, but is much shorter.

What if the same key occurs in both hashes? The hash slice approach always overwrites existing key/value pairs in %addresses . If you want other behavior, looping is more appropriate.
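
A sketch of the merge and of slice retrieval; the hash names follow the text, while every key and address is invented sample data.

```perl
use strict;
use warnings;

my %addresses = (
    Leonardo => '1123 Fib Place',
    Utako    => 'Cantor Hotel, Room 1',
);
my %canada_addresses = (
    Utako  => '42 Maple Street',
    Pierre => '7 Rue Principale',
);

# Slice assignment: existing keys are overwritten, new keys are added.
@addresses{ keys %canada_addresses } = values %canada_addresses;

# Slice retrieval: several values in one operation.
my ($first, $second) = @addresses{qw( Leonardo Pierre )};

print "$addresses{Utako}\n";    # prints 42 Maple Street
```

This relies on keys and values returning their lists in the same order for an unmodified hash.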

The Empty Hash

An empty hash contains no keys or values. It evaluates to a false value in a boolean context. A hash which contains at least one key/value pair evaluates to a true value in boolean context even if all of the keys or all of the values or both would themselves evaluate to false values in a boolean context.

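In scalar context, a hash in Perl versions before 5.26 evaluates to a string which represents the ratio of full buckets in the hash; these are internal details about the hash implementation that you can safely ignore. As of Perl 5.26, a hash in scalar context evaluates to the number of keys it contains.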

In list context, a hash evaluates to a list of key/value pairs similar to what you receive from the each operator. However, you cannot iterate over this list the same way you can iterate over the list produced by each, or the loop will never terminate:

You can loop over the list of keys and values with a for loop, but the iterator variable will get a key on one iteration and its value on the next, because Perl will flatten the hash into a single list of interleaved keys and values.

Hash Idioms

Because each key exists only once in a hash, assigning the same key to a hash multiple times stores only the most recent value. Use this to find unique list elements:

Using undef with a hash slice sets the values of the hash to undef . This idiom is the cheapest way to perform set operations with a hash.
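A sketch of the idiom, assuming a hypothetical @items list with duplicates. Each distinct item becomes a key, and the values are left undefined because only the keys matter.

```perl
use strict;
use warnings;

# Hypothetical input list.
my @items = qw( pear apple pear fig apple );

my %uniq;
undef @uniq{@items};    # create one key per distinct item

my @uniques = sort keys %uniq;
print "@uniques\n";
```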

Hashes are also useful for counting elements, such as IP addresses in a log file:

The initial value of a hash value is undef . The postincrement operator ( ++ ) treats that as zero. This in-place modification of the value increments an existing value for that key. If no value exists for that key, Perl creates a value ( undef ) and immediately increments it to one, as the numification of undef produces the value 0.
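The counting idiom might look like this. The @log_ips list stands in for addresses read from a log file.

```perl
use strict;
use warnings;

# Stand-in for addresses parsed from a log file.
my @log_ips = qw( 10.0.0.1 10.0.0.2 10.0.0.1 );

my %ip_addresses;
$ip_addresses{$_}++ for @log_ips;    # undef counts as 0, then increments

print "$_: $ip_addresses{$_}\n" for sort keys %ip_addresses;
```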

This strategy provides a useful caching mechanism to store the result of an expensive operation with little overhead:

This orcish maneuver (or-cache, if you like puns) returns the value from the hash, if it exists. Otherwise, it calculates, caches, and returns the value. The defined-or assignment operator ( //= ) evaluates its left operand. If that operand is not defined, the operator assigns the lvalue the value of its right operand. In other words, if there's no value in the hash for the given key, this function will call create_user() with the key and update the hash.

Perl 5.10 introduced the defined-or and defined-or assignment operators. Prior to 5.10, most code used the boolean-or assignment operator ( ||= ) for this purpose. Unfortunately, some valid values evaluate to a false value in boolean context, so evaluating the definedness of values is almost always more accurate. This lazy orcish maneuver tests for the definedness of the cached value, not truthiness.
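A minimal sketch of the caching idiom described above. The bodies of create_user() and fetch_user(), and the $creations counter, are assumptions made for illustration. The text names only create_user().

```perl
use strict;
use warnings;

my %user_cache;
my $creations = 0;

# Hypothetical stand-in for an expensive operation.
sub create_user {
    my ($id) = @_;
    $creations++;
    return { id => $id };
}

sub fetch_user {
    my ($id) = @_;

    # Assign only when the cached value is not yet defined.
    $user_cache{$id} //= create_user($id);
    return $user_cache{$id};
}

fetch_user(42) for 1 .. 3;
print "create_user ran $creations time(s)\n";
```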

If your function takes several arguments, use a slurpy hash ( Slurping ) to gather key/value pairs into a single hash as named function arguments:

This approach allows you to set default values:

... or include them in the hash initialization, as latter assignments take precedence over earlier assignments:
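The second form can be sketched as follows. The function make_sundae() and its parameter names are hypothetical. Defaults come first in the list, so any pairs the caller passes in @_ override them.

```perl
use strict;
use warnings;

# Hypothetical function with named arguments and defaults.
sub make_sundae {
    my %parameters = (
        flavor  => 'Vanilla',
        topping => 'fudge',
        @_,    # later pairs override the defaults above
    );

    return "$parameters{flavor} with $parameters{topping}";
}

my $default = make_sundae();
my $custom  = make_sundae( flavor => 'Lemon Burst' );

print "$default\n$custom\n";
```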

Locking Hashes

As hash keys are barewords, they offer little typo protection compared to the function and variable name protection offered by the strict pragma. The little-used core module Hash::Util provides mechanisms to ameliorate this.

To prevent someone from accidentally adding a hash key you did not intend (whether as a typo or from untrusted user input), use the lock_keys() function to restrict the hash to its current set of keys. Any attempt to add a new key to the hash will raise an exception. This is lax security suitable only for preventing accidents; anyone can use the unlock_keys() function to remove this protection.

Similarly you can lock or unlock the existing value for a given key in the hash ( lock_value() and unlock_value() ) and make or unmake the entire hash read-only with lock_hash() and unlock_hash() .
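A short sketch of key locking with Hash::Util. The %config hash and the misspelled key are invented for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw( lock_keys unlock_keys );

# Hypothetical configuration hash.
my %config = ( host => 'localhost', port => 8080 );

lock_keys(%config);

# A typo in the key name now raises an exception.
my $stored = eval { $config{prot} = 9090; 1 };
my $error  = $@;
print "rejected: $error" unless $stored;

# Existing keys can still be updated, and unlocking lifts the restriction.
$config{port} = 9090;
unlock_keys(%config);
$config{timeout} = 30;
```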

A Perl variable can hold at various times values of different types (strings, integers, rational numbers, and more). Rather than attaching type information to variables, Perl relies on the context provided by operators ( Numeric, String, and Boolean Context ) to know what to do with values. By design, Perl attempts to do what you mean (called DWIM, for do what I mean, or dwimmery), though you must be specific about your intentions. If you treat a variable which happens to contain a number as a string, Perl will do its best to coerce that number into a string.

Boolean Coercion

Boolean coercion occurs when you test the truthiness of a value, such as in an if or while condition. Numeric 0, undef , the empty string, and the string '0' all evaluate as false. All other values—including strings which may be numerically equal to zero (such as '0.0' , '0e' , and '0 but true' )—evaluate as true.

When a scalar has both string and numeric components ( Dualvars ), Perl 5 prefers to check the string component for boolean truth. '0 but true' evaluates to zero numerically, but it is not an empty string, thus it evaluates to a true value in boolean context.

String Coercion

String coercion occurs when using string operators such as comparisons ( eq and cmp ), concatenation, split , substr , and regular expressions, as well as when using a value as a hash key. The undefined value stringifies to an empty string and produces a "Use of uninitialized value" warning. Numbers stringify to strings containing their values, such that the value 10 stringifies to the string '10'. You can even split a number into individual digits with:

Numeric Coercion

Numeric coercion occurs when using numeric comparison operators (such as == and <=> ), when performing mathematic operations, and when using a value as an array or list index. The undefined value numifies to zero and produces a "Use of uninitialized value" warning. Strings which do not begin with numeric portions also numify to zero and produce an "Argument isn't numeric" warning. Strings which begin with characters allowed in numeric literals numify to those leading values, such that '10 leptons leaping' numifies to 10 and '6.022e23 moles marauding' numifies to 6.022e23, though the trailing non-numeric text still triggers an "Argument isn't numeric" warning when warnings are enabled.

The core module Scalar::Util contains a looks_like_number() function which reports whether Perl would treat a value as a number. It does not extract a number from a string.
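A brief sketch of looks_like_number() applied to a few sample strings. The candidate values are chosen for illustration. The function returns a true or false value rather than the number itself.

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

# Sample values chosen for illustration.
my @candidates = ( '42', '6.022e23', '10 leptons leaping', '' );

my %is_number;
for my $candidate (@candidates) {
    $is_number{$candidate} = looks_like_number($candidate) ? 1 : 0;
    printf "%-22s %s\n", "'$candidate'",
        $is_number{$candidate} ? 'number' : 'not a number';
}
```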

Mathematicians Rejoice

The strings Inf and Infinity represent the infinite value and behave as numbers. The string NaN represents the concept "not a number". Numifying them produces no "Argument isn't numeric" warning.

Reference Coercion

Using a dereferencing operation on an undefined value in a context which stores to it turns that value into a reference of the appropriate type. This process of autovivification ( Autovivification ) is handy when manipulating nested data structures ( Nested Data Structures ):

Although the hash never contained values for Brad and Jack , Perl helpfully created hash references for them, then assigned each a key/value pair keyed on id .
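A sketch consistent with that description. The hash name %users and the id values are assumptions, since the text gives only the keys Brad, Jack, and id.

```perl
use strict;
use warnings;

my %users;    # hypothetical name; starts out empty

# Perl creates the inner hash references on demand.
$users{Brad}{id} = 228;
$users{Jack}{id} = 229;

print ref( $users{Brad} ), "\n";
```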

Cached Coercions

Perl 5's internal representation of values stores both string and numeric values. Stringifying a numeric value does not replace the numeric value. Instead, it attaches a stringified value, so that the representation contains both components. Similarly, numifying a string value populates the numeric component while leaving the string component untouched.

Certain Perl operations prefer to use one component of a value over another; boolean checks prefer strings, for example. If a value has a cached representation in a form you do not expect, relying on an implicit conversion may produce surprising results. You almost never need to be explicit about what you expect (your author can recall doing so twice in over a decade of programming Perl 5), but knowing that this caching occurs may someday help you diagnose an odd situation.

The multi-component nature of Perl values is available to users in the form of dualvars . The core module Scalar::Util provides a function dualvar() which allows you to bypass Perl coercion and manipulate the string and numeric components of a value separately:
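A minimal sketch of dualvar(), using an invented value whose numeric component is 0 and whose string component is a non-empty name. Boolean tests consult the string, so the value is true even though it is numerically zero.

```perl
use strict;
use warnings;
use Scalar::Util 'dualvar';

# Numeric component 0, string component a non-empty name.
my $false_name = dualvar 0, 'Sparkles & Blue';

print "Boolean true!\n"  if $false_name;
print "Numeric false!\n" unless 0 + $false_name;
print "String true!\n"   if '' . $false_name;
```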

A Perl namespace associates and encapsulates various named entities within a named category, like your family name or a brand name. Unlike a real-world name, a namespace implies no direct relationship between entities. Such relationships may exist, but do not have to.

A package in Perl 5 is a collection of code in a single namespace. The distinction is subtle: the package represents the source code and the namespace represents the entity created when Perl parses that code.

The package builtin declares a package and a namespace:

All global variables and functions declared or referred to after the package declaration refer to symbols within the MyCode namespace. You can refer to the @boxes variable from the main namespace only by its fully qualified name of @MyCode::boxes . A fully qualified name includes a complete package name, so you can call the add_box() function only by MyCode::add_box() .
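A sketch matching the names used above ( MyCode , @boxes , add_box() ). The body of add_box() is an assumption made for illustration.

```perl
use strict;
use warnings;

package MyCode;

our @boxes;

# Assumed behavior: store a box and return the new count.
sub add_box {
    my ($size) = @_;
    push @boxes, $size;
    return scalar @boxes;
}

package main;

# From main, both symbols need their fully qualified names.
MyCode::add_box('small');
MyCode::add_box('large');
print "boxes: @MyCode::boxes\n";
```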

The scope of a package continues until the next package declaration or the end of the file, whichever comes first. Perl 5.14 enhanced package so that you may provide a block which explicitly delineates the scope of the declaration:

The default package is the main package. Without a package declaration, the current package is main . This rule applies to one-liners, standalone programs, and even .pm files.

Besides a name, a package has a version and three implicit methods, import() ( Importing ), unimport() , and VERSION() . VERSION() returns the package's version number. This number is a series of numbers contained in a package global named $VERSION . By rough convention, versions tend to be a series of integers separated by dots, as in 1.23 or 1.1.10 , where each segment is an integer.

Perl 5.12 introduced a new syntax intended to simplify version numbers, as documented in perldoc version::Internals . These stricter version numbers must have a leading v character and at least three integer components separated by periods:

With Perl 5.14, the optional block form of a package declaration is:

In 5.10 and earlier, the simplest way to declare the version of a package is:

Every package inherits a VERSION() method from the UNIVERSAL base class. You may override VERSION() , though there are few reasons to do so. This method returns the value of $VERSION :

If you provide a version number as an argument, this method will throw an exception unless the version of the module is equal to or greater than the argument:
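Both uses of VERSION() can be sketched like this, assuming a hypothetical MyCode package at version 1.23 declared in the pre-5.12 style so that the example runs on older Perls too.

```perl
use strict;
use warnings;

{
    package MyCode;
    our $VERSION = '1.23';
}

# Without an argument, VERSION() returns the value of $VERSION.
my $version = MyCode->VERSION;
print "MyCode is at version $version\n";

# With an argument, it throws unless the package is at least that version.
my $new_enough = eval { MyCode->VERSION('1.00'); 1 };
my $too_old    = !eval { MyCode->VERSION('2.00'); 1 };

print "1.00 requirement satisfied\n" if $new_enough;
print "2.00 requirement failed\n"    if $too_old;
```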

Packages and Namespaces

Every package declaration creates a new namespace if necessary and causes the parser to put all subsequent package global symbols (global variables and functions) into that namespace.

Perl has open namespaces . You can add functions or variables to a namespace at any point, either with a new package declaration:

... or by fully qualifying function names at the point of declaration:

You can add to a package at any point during compilation or runtime, regardless of the current file, though building up a package from multiple separate declarations can make code difficult to spelunk.

Namespaces can have as many levels as your organizational scheme requires, though namespaces are not hierarchical. The only relationship between packages is semantic, not technical. Many projects and businesses create their own top-level namespaces. This reduces the possibility of global conflicts and helps to organize code on disk. For example:

  • StrangeMonkey is the project name
  • StrangeMonkey::UI contains top-level user interface code
  • StrangeMonkey::Persistence contains top-level data management code
  • StrangeMonkey::Test contains top-level testing code for the project

... and so on.

Perl usually does what you expect, even if what you expect is subtle. Consider what happens when you pass values to functions:

Outside of the function, $name contains Chuck , even though the value passed into the function gets reversed into kcuhC . You probably expected that. The value of $name outside the function is separate from the $name inside the function. Modifying one has no effect on the other.
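The behavior described can be sketched as follows. The function name reverse_greeting() is an assumption. The text specifies only the $name variable and its values.

```perl
use strict;
use warnings;

# Hypothetical function: works on its own copy of the argument.
sub reverse_greeting {
    my $name = reverse shift;
    return "Hello, $name!";
}

my $name     = 'Chuck';
my $greeting = reverse_greeting($name);

print "$greeting\n";
print "$name\n";
```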

Consider the alternative. If you had to make copies of every value before anything could possibly change them out from under you, you'd have to write lots of extra defensive code.

Yet sometimes it's useful to modify values in place. If you want to pass a hash full of data to a function to modify it, creating and returning a new hash for each change could be troublesome (to say nothing of inefficient).

Perl 5 provides a mechanism by which to refer to a value without making a copy. Any changes made to that reference will update the value in place, such that all references to that value can reach the new value. A reference is a first-class scalar data type in Perl 5 which refers to another first-class data type.

Scalar References

The reference operator is the backslash ( \ ). In scalar context, it creates a single reference which refers to another value. In list context, it creates a list of references. To take a reference to $name :

You must dereference a reference to evaluate the value to which it refers. Dereferencing requires you to add an extra sigil for each level of dereferencing:

The double scalar sigil ( $$ ) dereferences a scalar reference.

While in @_ , parameters behave as aliases to caller variables (remember that for loops produce a similar aliasing behavior), so you can modify them in place:

You usually don't want to modify values this way—callers rarely expect it, for example. Assigning parameters to lexicals within your functions removes this aliasing behavior.
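A small sketch of in-place modification through @_ . The function name and sample string are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical function: $_[0] is an alias to the caller's variable.
sub reverse_value_in_place {
    $_[0] = reverse $_[0];
    return;
}

my $name = 'allizocohC';
reverse_value_in_place($name);

print "$name\n";
```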

Saving Memory with References

Modifying a value in place, or returning a reference to a scalar can save memory. Because Perl copies values on assignment, you could end up with multiple copies of a large string. Passing around references means that Perl will only copy the references—a far cheaper operation.

Complex references may require a curly-brace block to disambiguate portions of the expression. You may always use this syntax, though sometimes it clarifies and other times it obscures:

If you forget to dereference a scalar reference, Perl will likely coerce the reference. The string value will be of the form SCALAR(0x93339e8) , and the numeric value will be the 0x93339e8 portion. This value encodes the type of reference (in this case, SCALAR ) and the location in memory of the reference.

References Aren't Pointers

Perl does not offer native access to memory locations. The address of the reference is a value used as an identifier. Unlike pointers in a language such as C, you cannot modify the address or treat it as an address into memory. These addresses are only mostly unique because Perl may reuse storage locations as it reclaims unused memory.

Array References

Array references are useful in several circumstances:

  • To pass and return arrays from functions without flattening
  • To create multi-dimensional data structures
  • To avoid unnecessary array copying
  • To hold anonymous data structures

Use the reference operator to create a reference to a declared array:

Any modifications made through $cards_ref will modify @cards and vice versa. You may access the entire array as a whole with the @ sigil, whether to flatten the array into a list or count its elements:

Access individual elements by using the dereferencing arrow ( -> ):

The arrow is necessary to distinguish between a scalar named $cards_ref and an array named @cards_ref . Note the use of the scalar sigil ( Variable Sigils ) to access a single element.

Doubling Sigils

An alternate syntax prepends another scalar sigil to the array reference. It's shorter, if uglier, to write my $first_card = $$cards_ref[0]; .

Use the curly-brace dereferencing syntax to slice ( Array Slices ) an array reference:

You may omit the curly braces, but their grouping often improves readability.
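The operations in this section can be sketched together. The names @cards and $cards_ref come from the text. The card values are invented.

```perl
use strict;
use warnings;

my @cards     = qw( K Q J 10 9 8 );
my $cards_ref = \@cards;

# Whole-array access with the @ sigil.
my $card_count = @$cards_ref;

# Single elements with the dereferencing arrow.
my $first_card = $cards_ref->[0];
my $last_card  = $cards_ref->[-1];

# A slice with the curly-brace dereferencing syntax.
my @high_cards = @{$cards_ref}[ 0 .. 2 ];

# Changes through the reference are visible in @cards.
push @$cards_ref, 7;

print "count was $card_count, now ", scalar(@cards), "\n";
print "high cards: @high_cards\n";
```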

To create an anonymous array—without using a declared array—surround a list of values with square brackets:

This array reference behaves the same as named array references, except that the anonymous array brackets always create a new reference. Taking a reference to a named array always refers to the same array with regard to scoping. For example:

... both $sunday_ref and $monday_ref now contain a dessert, while:

... neither $sunday_ref nor $monday_ref contains a dessert. Within the square brackets used to create the anonymous array, list context flattens the @meals array into a list unconnected to @meals .
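A sketch contrasting the two cases in one program. The meal names are invented, and the anonymous copies use the hypothetical names $copy_one and $copy_two so that both cases can sit side by side.

```perl
use strict;
use warnings;

my @meals = qw( soup sandwiches pizza );

# References to the named array share its contents.
my $sunday_ref = \@meals;
my $monday_ref = \@meals;

# Anonymous arrays hold independent copies of the current contents.
my $copy_one = [@meals];
my $copy_two = [@meals];

push @meals, 'ice cream sundae';

print scalar(@$sunday_ref), " items via reference to \@meals\n";
print scalar(@$copy_one),   " items in the anonymous copy\n";
```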

Hash References

Use the reference operator on a named hash to create a hash reference :

Access the keys or values of the hash by prepending the reference with the hash sigil % :

Access individual values of the hash (to store, delete, check the existence of, or retrieve) by using the dereferencing arrow or double sigils:

Use the array sigil ( @ ) and disambiguation braces to slice a hash reference:

Create anonymous hashes in place with curly braces:

As with anonymous arrays, anonymous hashes create a new anonymous hash on every execution.

Watch Those Braces!

The common novice error of assigning an anonymous hash to a standard hash produces a warning about an odd number of elements in the hash. Use parentheses for a named hash and curly brackets for an anonymous hash.
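The hash reference operations above might look like this in practice. The %colors data and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Parentheses for a named hash.
my %colors     = ( blue => 'azul', gold => 'dorado', red => 'rojo' );
my $colors_ref = \%colors;

# Whole-hash access with the % sigil.
my @english = sort keys %$colors_ref;

# Single values with the arrow or with double sigils.
my $spanish_red  = $colors_ref->{red};
my $spanish_gold = $$colors_ref{gold};

# A slice with the @ sigil and disambiguation braces.
my @some_colors = @{$colors_ref}{qw( blue gold )};

# Curly braces for an anonymous hash.
my $food_ref = { candy => 'dulces', ice_cream => 'helado' };

print "@english\n@some_colors\n";
```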

Automatic Dereferencing

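As of Perl 5.14, Perl can automatically dereference certain references on your behalf. This was an experimental feature, and Perl 5.24 removed it again, so avoid it in code which must run on newer versions. Given an array reference in $arrayref , you can write: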

Given an expression which returns an array reference, you can do the same:

The same goes for the array operators pop , shift , unshift , splice , keys , values , and each and the hash operators keys , values , and each .

If the reference provided is not of the proper type—if it does not dereference properly—Perl will throw an exception. While this may seem more dangerous than explicitly dereferencing references directly, it is in fact the same behavior:

Function References

Perl 5 supports first-class functions in that a function is a data type just as is an array or hash. This is most obvious with function references , and enables many advanced features ( Closures ). Create a function reference by using the reference operator on the name of a function:

Without the function sigil ( & ), you will take a reference to the function's return value or values.

Create anonymous functions with the bare sub keyword:

The use of the sub builtin without a name compiles the function as normal, but does not install it in the current namespace. The only way to access this function is via the reference returned from sub . Invoke the function reference with the dereferencing arrow:

Perl 4 Function Calls

An alternate invocation syntax for function references uses the function sigil ( & ) instead of the dereferencing arrow. Avoid this syntax; it has subtle implications for parsing and argument passing.

Think of the empty parentheses as denoting an invocation dereferencing operation in the same way that square brackets indicate an indexed lookup and curly brackets cause a hash lookup. Pass arguments to the function within the parentheses:
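A short sketch of named and anonymous function references, both invoked with the dereferencing arrow. The function names and messages are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical named function.
sub bake_cake {
    my ($flavor) = @_;
    return "Baking a $flavor cake";
}

# Reference to a named function; note the & sigil.
my $cake_ref = \&bake_cake;

# Anonymous function, reachable only through its reference.
my $pie_ref = sub {
    my ($filling) = @_;
    return "Baking a $filling pie";
};

my $cake = $cake_ref->('chocolate');
my $pie  = $pie_ref->('apple');

print "$cake\n$pie\n";
```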

You may also use function references as methods with objects ( Moose ). This is useful when you've already looked up the method ( Reflection ):

Filehandle References

When you use open 's (and opendir 's) lexical filehandle form, you deal with filehandle references. Internally, these filehandles are IO::File objects. You can call methods on them directly. As of Perl 5.14, this is as simple as:

You must use IO::File; in 5.12 to enable this and use IO::Handle; in 5.10 and earlier. Even older code may take references to typeglobs:

This idiom predates lexical filehandles (introduced with Perl 5.6.0 in March 2000). You may still use the reference operator on typeglobs to take references to package-global filehandles such as STDIN , STDOUT , STDERR , or DATA —but these are all global names anyhow.

Prefer lexical filehandles when possible. With the benefit of explicit scoping, lexical filehandles allow you to manage the lifespan of filehandles as a feature of Perl 5's memory management.

Reference Counts

Perl 5 uses a memory management technique known as reference counting . Every Perl value has a counter attached. Perl increases this counter every time something takes a reference to the value, whether implicitly or explicitly. Perl decreases that counter every time a reference goes away. When the counter reaches zero, Perl can safely recycle that value.

How does Perl know when it can safely release the memory for a variable? How does Perl know when it's safe to close the file opened in this inner scope:

Within the inner block in the example, there's one $fh . (Multiple lines in the source code refer to it, but only one variable refers to it: $fh .) $fh is only in scope in the block. Its value never leaves the block. When execution reaches the end of the block, Perl recycles the variable $fh and decreases the reference count of the contained filehandle. The filehandle's reference count reaches zero, so Perl recycles it to reclaim memory, and calls close() implicitly.

You don't have to understand the details of how all of this works. You only need to understand that your actions in taking references and passing them around affect how Perl manages memory (see Circular References ).

References and Functions

When you use references as arguments to functions, document your intent carefully. Modifying the values of a reference from within a function may surprise the calling code, which doesn't expect anything else to modify its data. To modify the contents of a reference without affecting the reference itself, copy its values to a new variable:

This is only necessary in a few cases, but explicit cloning helps avoid nasty surprises for the calling code. If you use nested data structures or other complex references, consider the use of the core module Storable and its dclone ( deep cloning ) function.
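A sketch of both kinds of copy. The function sorted_copy() and the sample data are assumptions. dclone() is the Storable function named above.

```perl
use strict;
use warnings;
use Storable 'dclone';

# Hypothetical function: copies the referenced array before changing it.
sub sorted_copy {
    my ($items_ref) = @_;
    my @sorted = sort @$items_ref;    # shallow copy, then sort
    return \@sorted;
}

my @queue      = qw( pear apple fig );
my $sorted_ref = sorted_copy( \@queue );

# A shallow copy would share inner references; dclone copies them too.
my %original = ( tags => [ 'a', 'b' ] );
my $clone    = dclone( \%original );
push @{ $clone->{tags} }, 'c';

print "original order: @queue\n";
print "sorted copy:    @$sorted_ref\n";
```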

Nested Data Structures

Perl's aggregate data types—arrays and hashes—allow you to store scalars indexed by integer or string keys. Perl 5's references ( References ) allow you to access aggregate data types through special scalars. Nested data structures in Perl, such as an array of arrays or a hash of hashes, are possible through the use of references.

Use the anonymous reference declaration syntax to declare a nested data structure:

Commas are Free

Perl allows but does not require the trailing comma so as to ease adding new elements to the list.

Use Perl's reference syntax to access elements in nested data structures. The sigil denotes the amount of data to retrieve, and the dereferencing arrow indicates that the value of one portion of the data structure is a reference:

The only way to nest a multi-level data structure is through references, so the arrow is superfluous. You may omit it for clarity, except for invoking function references:
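A small sketch of declaring and reading a nested structure. The %famous_triplets data is invented. The last two lookups show that the arrow between subscripts is optional.

```perl
use strict;
use warnings;

# Invented data: a hash of anonymous arrays, with a trailing comma.
my %famous_triplets = (
    counts => [qw( eenie miney moe )],
    ducks  => [qw( huey dewey louie )],
    game   => [qw( paper scissors rock )],
);

# With and without the arrow between subscripts.
my $with_arrow    = $famous_triplets{ducks}->[1];
my $without_arrow = $famous_triplets{ducks}[1];

print "$with_arrow\n";
```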

Use disambiguation blocks to access components of nested data structures as if they were first-class arrays or hashes:

... or to slice a nested data structure:

Whitespace helps, but does not entirely eliminate the noise of this construct. Use temporary variables to clarify:

... or use for 's implicit aliasing to $_ to avoid the use of an intermediate reference:

perldoc perldsc , the data structures cookbook, gives copious examples of how to use Perl's various data structures.

Autovivification

When you attempt to write to a component of a nested data structure, Perl will create the path through the data structure to the destination as necessary:

After the second line of code, this array of arrays of arrays of arrays contains an array reference in an array reference in an array reference in an array reference. Each array reference contains one element. Similarly, treating an undefined value as if it were a hash reference in a nested data structure will make it so:
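Both cases can be sketched in a few lines. The variable names and stored values are invented for illustration.

```perl
use strict;
use warnings;

# Writing four levels deep creates every intermediate array reference.
my @aoaoaoa;
$aoaoaoa[0][0][0][0] = 'nested deeply';

# The same happens for hash references.
my %hohoh;
$hohoh{Robot}{Santa}{Claus} = 'mostly harmful';

print ref( $aoaoaoa[0][0][0] ), "\n";
print ref( $hohoh{Robot}{Santa} ), "\n";
```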

This useful behavior is autovivification . While it reduces the initialization code of nested data structures, it cannot distinguish between the honest intent to create missing elements in nested data structures and typos. The autovivification pragma ( Pragmas ) from the CPAN lets you disable autovivification in a lexical scope for specific types of operations.

You may wonder at the contradiction between taking advantage of autovivification while enabling strictures. The question is one of balance. Is it more convenient to catch errors which change the behavior of your program at the expense of disabling error checks for a few well-encapsulated symbolic references? Is it more convenient to allow data structures to grow rather than specifying their size and allowed keys?

The answers depend on your project. During early development, allow yourself the freedom to experiment. While testing and deploying, consider an increase of strictness to prevent unwanted side effects. Thanks to the lexical scoping of the strict and autovivification pragmas, you can enable these behaviors where and as necessary.

You can verify your expectations before dereferencing each level of a complex data structure, but the resulting code is often lengthy and tedious. It's better to avoid deeply nested data structures by revising your data model to provide better encapsulation.

Debugging Nested Data Structures

The complexity of Perl 5's dereferencing syntax combined with the potential for confusion with multiple levels of references can make debugging nested data structures difficult. Two good visualization tools exist.

The core module Data::Dumper converts values of arbitrary complexity into strings of Perl 5 code:

This is useful for identifying what a data structure contains, what you should access, and what you accessed instead. Data::Dumper can dump objects as well as function references (if you set $Data::Dumper::Deparse to a true value).

While Data::Dumper is a core module and prints Perl 5 code, its output is verbose. Some developers prefer the use of the YAML::XS or JSON modules for debugging. They do not produce Perl 5 code, but their outputs can be much clearer to read and to understand.

Circular References

Perl 5's memory management system of reference counting ( Reference Counts ) has one drawback apparent to user code. Two references which eventually point to each other form a circular reference that Perl cannot destroy on its own. Consider a biological model, where each entity has two parents and zero or more children:

Both $alice and $robert contain an array reference which contains $cianne . Because $cianne is a hash reference which contains $alice and $robert , Perl can never decrease the reference count of any of these three people to zero. It doesn't recognize that these circular references exist, and it can't manage the lifespan of these entities.

Either break the reference count manually yourself (by clearing the children of $alice and $robert or the parents of $cianne ), or use weak references . A weak reference is a reference which does not increase the reference count of its referent. Weak references are available through the core module Scalar::Util . Its weaken() function prevents a reference count from increasing:

Now $cianne will retain references to $alice and $robert , but those references will not by themselves prevent Perl's garbage collector from destroying those data structures. Most data structures do not need weak references, but when they're necessary, they're invaluable.
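A sketch of the family model with weakened parent references. The exact hash layout is an assumption based on the description above. weaken() and isweak() are Scalar::Util functions.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken isweak );

# Assumed layout: each person has parents and a list of children.
my $alice  = { mother => '',     father => '',      children => [] };
my $robert = { mother => '',     father => '',      children => [] };
my $cianne = { mother => $alice, father => $robert, children => [] };

push @{ $alice->{children} },  $cianne;
push @{ $robert->{children} }, $cianne;

# The child's links to its parents no longer count as references.
weaken( $cianne->{mother} );
weaken( $cianne->{father} );

print "mother link is weak\n" if isweak( $cianne->{mother} );
```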

Alternatives to Nested Data Structures

While Perl is content to process data structures nested as deeply as you can imagine, the human cost of understanding these data structures and their relationships—to say nothing of the complex syntax—is high. Beyond two or three levels of nesting, consider whether modeling various components of your system with classes and objects ( Moose ) will allow for clearer code.

Perl | Variables

Variables in Perl are used to store and manipulate data throughout the program. When a variable is created, it occupies memory space. The type of a variable tells the interpreter how to allocate that memory and what may be stored in it. Depending on its type, a variable can therefore store integers, decimals, or strings.

A variable in Perl can be given almost any name, as long as it follows a few rules:

  • Variable names are case-sensitive: $John and $john are two different variables.
  • A name starts with $, @ or % according to the datatype required, followed by a letter or underscore and then zero or more letters, underscores, and digits.
  • Variable names cannot contain white space or any special character other than the underscore.

Variables are declared according to the datatype they hold. There are three datatypes:

  • Scalar Variables: A scalar holds a single string or numeric value. Its name starts with the $ symbol.
Syntax: $var_name = value;
  • Array Variables: An array holds an ordered list of values, indexed by position. Its name starts with the @ symbol.
Syntax: @var_name = (val1, val2, val3, …);
  • Hash Variables: A hash holds (key, value) pairs, each efficiently accessed by its key. Its name starts with the % symbol.
Syntax: %var_name = (key1 => val1, key2 => val2, key3 => val3, …);

   Perl allows modifying its variable values anytime after the variable declaration is done. There are various ways for the modification of a variable:  

  • A scalar variable can be modified simply by redefining its value. 
  • An element of an array can be modified by passing the index of that element to the array and defining a new value to it. 
  • A value in a hash can be modified by using its Key. 
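The three kinds of modification can be sketched as follows, with invented variable names and values.

```perl
use strict;
use warnings;

# Scalar: assign a new value.
my $age = 20;
$age = 21;

# Array: assign to an element by its index.
my @languages = ( 'Perl', 'Ruby', 'C' );
$languages[1] = 'Python';

# Hash: assign to a value by its key.
my %stock = ( apples => 3, pears => 7 );
$stock{apples} = 10;

print "$age\n@languages\n$stock{apples}\n";
```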

Perl provides several ways to assign a string to a variable: single quotes, double quotes, the q operator, the qq operator, and so on. Single-quoted and double-quoted strings look alike, but they behave differently. A string written in single quotes is displayed exactly as written, with no variable interpolation.

The above code will print:

Whereas strings written within double quotes replace the variables with their value and then display the string. It even replaces the escape sequences with their real use. Example:  

The above code will print: 

Example Code: 
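A minimal sketch of the difference between the quoting styles. The variable names are invented. The q and qq operators behave like single and double quotes respectively.

```perl
use strict;
use warnings;

my $name = 'Perl';

# Single quotes and q(): the text is kept exactly as written.
my $single   = 'Hello, $name\n';
my $q_quoted = q(Hello, $name\n);

# Double quotes and qq(): variables and escape sequences are replaced.
my $double    = "Hello, $name\n";
my $qq_quoted = qq(Hello, $name\n);

print $single, "\n";
print $double;
```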

Static and state variables in Perl

state variable

State is executed in the first call, static variables in the "traditional" way, first assignment time, shared static variable, static arrays and hashes.
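
To make the topics above concrete, here is a minimal sketch contrasting a state variable with the "traditional" closure approach. The function names next_id and count_calls are hypothetical, and the state keyword assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use feature 'state';

# state: the initialization runs only on the first call,
# and the value is kept between calls.
sub next_id {
    state $id = 0;
    return ++$id;
}

# "Traditional" static variable: a lexical in an enclosing block,
# visible only to the function defined inside that block.
{
    my $calls = 0;
    sub count_calls {
        return ++$calls;
    }
}

my @ids   = (next_id(), next_id(), next_id());
my @calls = (count_calls(), count_calls());

print "@ids\n";     # prints: 1 2 3
print "@calls\n";   # prints: 1 2
```

In the traditional form the assignment to $calls happens when the enclosing block is executed, so the block has to run before the first call to count_calls.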

Gabor Szabo

Published on 2014-01-15


How can I automatically initialize all the scalar variables in Perl?

Perl automatically initializes variables to undef by default.

Is there a way to override this default behavior and tell the Perl interpreter to initialize variables to zero (or some other fixed value)?

  • Initializing variables to undef is still an initialization. –  zigdon Nov 3, 2010 at 19:07
  • May I ask why you would want to do this? –  Zaid Nov 3, 2010 at 19:11
  • @Zaid: The full story - I am reading Code Complete (which recommends always initializing variables at declaration, if possible) and was looking at one of my old files that uses a lot of counters. I had initialized them all to zero at first but later removed the unnecessary initializations. I am thinking about reintroducing all the initializations, and asked this question to find a better way to do it. –  Lazer Nov 3, 2010 at 19:21

5 Answers

The recommendation in Code Complete is important for languages such as C because, when you declare a local variable without an initializer (for example, int counter;), the value of counter is whatever happens to occupy that memory.

In Perl, when you declare a variable using my $counter; there is no doubt that the value of $counter is undef, not some random garbage.

Therefore, the motivation behind the recommendation, i.e. to ensure that all variables start out with known values, is automatically satisfied in Perl and it is not necessary to do anything.

What you do with counters is increment or decrement them. The result of incrementing a freshly declared, still-undefined counter (for example, my $counter; $counter++;) is well defined in Perl: $counter will hold the value 1.
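
A short sketch of that behaviour, using a hypothetical helper variable $was_defined to record the state before the increment:

```perl
use strict;
use warnings;

my $counter;                                   # declared, value is undef
my $was_defined = defined($counter) ? 1 : 0;   # 0: not defined yet

$counter++;                                    # undef is treated as 0, no warning

print "defined before increment: $was_defined\n";   # prints: defined before increment: 0
print "counter: $counter\n";                        # prints: counter: 1
```

The ++ operator is among the operators that do not warn about an undefined operand, so this runs cleanly under use warnings.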

Finally, I would argue that, in most cases, counters are not necessary in Perl and code making extensive use of counter variables may need to be rewritten.

–  Sinan Ünür

  • When did rewritten turn into refactored, anyway? ☺ –  tchrist Nov 3, 2010 at 21:23

As far as I know, this is not possible (and shouldn't be; it's even more dangerous than $[ ).

You can initialize your variables as follows to cut down on boilerplate:
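
One idiom that fits this description, offered as a sketch and not necessarily the snippet the answer originally showed, is the list repetition operator x. The variable names are illustrative.

```perl
use strict;
use warnings;

# (0) x 3 builds the list (0, 0, 0), which is assigned element by element.
my ($hits, $misses, $errors) = (0) x 3;

print "$hits $misses $errors\n";   # prints: 0 0 0
```

The repeat count has to match the number of variables; any variable beyond the end of the list is left undef.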

or move initialization to a function:
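
A possible shape for such a function, assuming a hypothetical helper named zero. It relies on the fact that the elements of @_ are aliases to the caller's arguments, so assigning to them changes the caller's variables.

```perl
use strict;
use warnings;

# Set every argument to 0 through the @_ aliases.
sub zero {
    $_ = 0 for @_;
    return;
}

# Declare and initialize in one statement.
zero(my ($hits, $misses, $errors));

print "$hits $misses $errors\n";   # prints: 0 0 0
```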

–  Eric Strom

  • $[ is the index that arrays start from. It was there to make Perl behave like awk when set to 1. It is deprecated, warns, has only file scope, and must be set at compile time, because of all the bugs it could cause in unrelated code. See perldoc perlvar for more info. –  Eric Strom Nov 3, 2010 at 20:09
  • Can't see why you would need to initialize something to zero at its point of declaration. –  tchrist Nov 3, 2010 at 22:37
  • @tchrist => If the first expression a variable is used in is a comparison or complex mathematical expression (think iterator sub with closed over state), you will get a warning on undef values. In those cases, initializing at the declaration site seems as good a place as any. –  Eric Strom Nov 4, 2010 at 0:23
  • well, perhaps. But I usually keep them at zero when I want to test them boolishly, or use ++ or += on, none of which requires such. –  tchrist Nov 4, 2010 at 0:35

No. Doing this can lead to some very scary and hard-to-decipher bugs, so it's not a good idea to change behaviour like this anyway.

In Perl, you can declare variables right when you need them for the first time, so there isn't generally a need to declare them first (with or without initialization) and then use them later. Additionally, operators such as ++ work with undefined values just as well as with zero, so you don't need to initialize counters at all:
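
As an illustration of both points (the word list is made up), the following counts occurrences without initializing any counter to zero, and declares each variable where it is first needed:

```perl
use strict;
use warnings;

my @words = qw(apple pear apple fig pear apple);

# Each hash value starts out nonexistent; ++ treats it as 0.
my %seen;
$seen{$_}++ for @words;

# $word is declared right where the loop needs it.
for my $word (sort keys %seen) {
    print "$word: $seen{$word}\n";
}
# prints:
# apple: 3
# fig: 1
# pear: 2
```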

However, I can insert a plug for Moose by mentioning that you can achieve automatic initialization of attributes in your Moose classes:
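
A sketch of what such a class might look like, assuming Moose is installed from CPAN. The class name Settings is hypothetical, and the attribute names are chosen to match the output shown.

```perl
use strict;
use warnings;

package Settings;
use Moose;

# Each attribute gets its default value when the object is constructed.
has string => (is => 'rw', isa => 'Str', default => 'initial value');
has number => (is => 'rw', isa => 'Int', default => 0);

package main;

my $obj = Settings->new;
print "string has value: ", $obj->string, "\n";
print "number has value: ", $obj->number, "\n";
```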

string has value: initial value
number has value: 0

–  Ether

Do you have a concrete reason for wanting to do this, or is it simply "because Code Complete says I should"?

If the former, please share the reason and we can discuss properly Perly ways to accomplish your real goal.

If the latter, please remember that Code Complete is a set of guidelines for programming in C, not Perl. Perl is not C and has its own set of strengths and weaknesses, which also means that it has a different set of... and I hate to use this phrase... best practices. Guidelines appropriate for one language do not necessarily apply to the other. "Always initialize variables (if possible) when you declare them" is a sound practice in C, but generally unnecessary in Perl.

–  Dave Sherohman

Why would I not simply initialize a variable just prior to its first use?

Answer: I keep lots of "stock code" lying about. The whole "just declare it" when first used doesn't really work with my production style. I might write something where "my $FirstVariable = 0;" is included as the first line of a code block; then later, insert something above that block that also references $FirstVariable. Now I've broken my routine by moving blocks of code about.

If I "declare" all the variables I'll be needing at the top, then I don't have to worry about rearranging blocks of code killing the procedure because a necessary variable declaration no longer precedes its first use.

Also, if the variable declarations are at the top, I can easily comment one declaration out, then try running the routine. This can be handy for troubleshooting large, complex programs because it will produce a list, complete with line numbers, of every place that the [now undeclared] variable is referenced. When I'm done looking everything over, I just delete the comment character and I'm back in operation!

–  Larry

How do I initialize multiple variables in one statement?

How do I initialize multiple variables without having to manually specify each one?

The problem

The solution.
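
The original solution text is not reproduced here. As a hedged sketch, two common ways to give several variables the same starting value in one statement are chained assignment and a hash slice; the names below are illustrative.

```perl
use strict;
use warnings;

# Chained assignment: evaluated right to left, so all three end up as 0.
my $x = my $y = my $z = 0;

# Hash slice: when the "variables" are really a set of named values,
# a hash avoids listing a separate scalar for each one.
my @names = qw(hits misses errors);
my %total;
@total{@names} = (0) x @names;

print "$x $y $z\n";                          # prints: 0 0 0
print join(' ', map { $total{$_} } @names), "\n";   # prints: 0 0 0
```

With the hash slice, adding another name to @names is enough; the repeat count follows the length of the list automatically.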

COMMENTS

  1. Array initialization in Perl

    Related questions: Array assignment in Perl; Initialize Perl array elements if not defined or empty; Perl array initialized incorrectly; Strange behavior of array declaration in Perl; Initialize multidimensional array; Input array in constructor.

  2. Hashes in Perl

    In this article of the Perl Tutorial we are going to learn about hashes, one of the powerful parts of Perl. Sometimes called associative arrays, dictionaries, or maps, hashes are one of the data structures available in Perl. A hash is an unordered group of key-value pairs. The keys are unique strings. The values are scalar values.

  3. Perl Array

    A list is immutable so you cannot change it directly. In order to change a list, you need to store it in an array variable. By definition, an array is a variable that provides dynamic storage for a list. In Perl, the terms array and list are often used interchangeably, but you have to note an important difference: a list is immutable whereas an array is mutable.

  4. Comprehensive Guide to Initialize Arrays in Perl

    This article aims to provide a comprehensive guide on how to initialize arrays in Perl, along with tips, tricks, and common errors. Initializing Arrays in Perl; Direct Assignment: In Perl, arrays can be initialized directly by assigning values to them. An array is defined by the '@' symbol followed by the array name. The assignment operator ...

  5. Arrays: A Tutorial/Reference

    Arrays: A Tutorial/Reference. Array is a type of Perl variable. An array variable is an ordered collection of any number (zero or more) of elements . Each element in an array has an index which is a non-negative integer. Perl arrays are (like nearly everything else in the language) dynamic: they grow as necessary, without any need for explicit ...

  6. The Essential Guide to Perl Variable

    To manipulate data in your program, you use variables. Perl provides three types of variables: scalars, arrays, and hashes, which hold scalar values, lists, and key-value pairs respectively. You'll focus on the scalar variable in this tutorial.

  7. Perl hash basics: create, update, loop, delete and sort

    This article describes the main functions and syntax rules for working with hashes in Perl. Declaration and initialization. A hash is an unsorted collection of key-value pairs. Within a hash a key is a unique string that references a particular value. A hash can be modified once initialized. Because a hash is unsorted, if its contents ...

  8. Perl Hash

    A Perl hash is defined by key-value pairs. Perl stores elements of a hash in such an optimal way that you can look up its values based on keys very fast. With the array, you use indices to access its elements. However, you must use descriptive keys to access hash element. A hash is sometimes referred to as an associative array.

  9. 10 examples of initializing a Hash variable in Perl

    my %h1=@arr; This is actually a way of converting an array to a hash. Just by assigning an array to a hash gets an array converted to a hash. From the array, every 2 values are taken and formed a key-value pair for the hash. 6. Using the map function to construct hash: my @arr = ("Unix","Linux","AIX");

  10. perldata

    Variable names. Perl has three built-in data types: scalars, arrays of scalars, and associative arrays of scalars, known as "hashes". A scalar is a single string (of any size, limited only by the available memory), number, or a reference to something (which will be discussed in perlref ). Normal arrays are ordered lists of scalars indexed by ...

  11. Perl Hash

    Perl Hash table tutorial. A hash in Perl always starts with a percentage sign: %. When accessing an element of a hash we replace the % by a dollar sign $ and put curly braces {} after the name. Inside the curly braces we put the key. A hash is an unordered set of key-value pairs where the keys are unique. A key can be any string, including numbers that are automatically converted to strings.

  12. How To Perform Perl Hash Manipulation For Data Handling

    Hashes, with their key-value pairs, are core to Perl's data structures. Let's delve into how to create and initialize them effectively. Basic Hash Creation; Initialization Using Lists; Using References; Basic Hash Creation. To create a hash in Perl, you make use of the percent symbol (%). The key and value within a pair are connected by the fat ...

  13. The Perl Language (Modern Perl 2011-2012)

    The Perl Language. Like a spoken language, the whole of Perl is a combination of several smaller but interrelated parts. Unlike spoken language, where nuance and tone of voice and intuition allow people to communicate despite slight misunderstandings and fuzzy concepts, computers and source code require precision.

  14. perl

    The use of $[ has been discouraged, deprecated, and all but disallowed. See it in perlvar (in the Deprecated and Removed Variables section) and see the core arybase module, where it has been moved. Still, if you must, you can disable this particular warning category (only in pre-v5.30). use strict; use warnings; # Restrict the scope as much as possible { no warnings 'deprecated'; $[ = 1 ...

  15. Variable declaration in Perl

    Author: Gabor Szabo, who runs the Perl Maven site and the Perl Weekly newsletter.

  16. Perl

    Perl is a general purpose, high level interpreted and dynamic programming language. Perl was originally developed for the text processing like extracting the required information from a specified text file and for converting the text file into a different form. Perl supports both the procedural and Object-Oriented programming. Perl is a lot similar

  17. Static and state variables in Perl

    Static and state variables in Perl. In most cases we want a variable to be accessible only from inside a small scope: inside a function or even inside a loop. These variables get created when we enter the function (or the scope created by a block) and destroyed when we leave the scope. In some cases, especially when we don't ...

  18. How can I automatically initialize all the scalar variables in Perl?

    In Perl, you can declare variables right when you need them for the first time, so there isn't generally a need to declare them first (with or without initialization) and then use them later. Additionally, operators such as ++ work with undefined values just as well as with zero, so you don't need to initialize counters at all.

  19. How do I initialize multiple variables in one statement?

    How to initialize multiple variables in one statement