Some Important Functions in NumPy to Generate Random Numbers

When creating a dummy dataset, it is sometimes necessary to use random numbers, but how can this be done with NumPy?

Photo by Jonathan Petersson from Pexels

Data science is a discipline that requires data. When you are just starting to learn data science, using real-world datasets can be difficult. Creating a dummy dataset is the solution to that problem. Dummy datasets are easier to make if we use random numbers.

When generating random numbers, we often need them to meet specific criteria, such as falling within a given range or following a particular distribution. The NumPy library makes this easy: it provides several functions that generate random numbers according to the criteria we want.

In this article, we will discuss some essential functions in NumPy that can be used as random number generators.

np.random.random()

When you want to generate random numbers in the half-open interval [0.0, 1.0), you can use np.random.random(). It returns floats that are greater than or equal to zero and less than one.

np.random.random(size=None)

This function takes a single optional argument, size, which sets how many random numbers to generate. Pass an integer to get a 1-D array or a tuple to get a multi-dimensional array. If size is omitted, a single float is returned.

Image by Author
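
As a minimal sketch of how np.random.random() can be called (the variable names are only illustrative, and the printed values will differ on every run):

```python
import numpy as np

# A single float in [0.0, 1.0)
single_value = np.random.random()

# Five floats in [0.0, 1.0), returned as a 1-D array
five_values = np.random.random(5)

# A 2x3 array; for more than one dimension, size is passed as a tuple
matrix = np.random.random((2, 3))

print(single_value)
print(five_values)
print(matrix)
```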

np.random.rand()

Next, there is the np.random.rand() function, which also generates uniformly distributed random numbers in the interval [0.0, 1.0).

np.random.rand(d0, d1, ..., dn)

Unlike the previous function, this one does not take a size argument. Instead, the dimensions of the output array are passed as separate positional integers. If no argument is given, a single float is returned.

Image by Author
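
A short sketch of np.random.rand(), assuming we want a 1-D array and a 2x3 array (the names flat and grid are only illustrative):

```python
import numpy as np

# Four uniformly distributed floats in [0.0, 1.0)
flat = np.random.rand(4)

# A 2x3 array: dimensions are passed as separate integers, not as a tuple
grid = np.random.rand(2, 3)

print(flat.shape)  # (4,)
print(grid.shape)  # (2, 3)
```

The difference from np.random.random() is mostly in how the shape is written: np.random.random((2, 3)) versus np.random.rand(2, 3).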

np.random.randn()

When you want to create normally distributed random numbers, you can use the np.random.randn() function.

np.random.randn(d0, d1, ..., dn)

Like np.random.rand(), this function takes the dimensions of the output array as separate positional integers. It returns floats drawn from the standard normal distribution, which has a mean of 0 and a standard deviation of 1.

Image by Author
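
To see the mean and standard deviation in practice, here is a small sketch that draws a large sample (the sample size of 10,000 is an arbitrary choice for illustration):

```python
import numpy as np

# 10,000 values from the standard normal distribution
samples = np.random.randn(10000)

# With a sample this large, the mean should be close to 0
# and the standard deviation close to 1 (exact values vary per run)
print(samples.mean())
print(samples.std())

# A 3x2 array of standard normal values
table = np.random.randn(3, 2)
print(table.shape)  # (3, 2)
```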

We can change the mean and standard deviation to any values we want. To do so, multiply the random value by the standard deviation (sigma), then add the mean (mu) to the result: sigma * np.random.randn() + mu.

Image by Author

In the example above, we created a normally distributed random value with a mean of 10 and a standard deviation of 3.
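
The transformation can be sketched as follows, using the same mean of 10 and standard deviation of 3 (the sample size is an assumption made only so the statistics are easy to check):

```python
import numpy as np

mu = 10     # desired mean
sigma = 3   # desired standard deviation

# Scale the standard normal values by sigma, then shift them by mu
samples = sigma * np.random.randn(10000) + mu

# Should be close to 10 and 3 respectively (exact values vary per run)
print(samples.mean())
print(samples.std())
```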

np.random.randint()

When you want to create random integers between a lower and an upper bound, use np.random.randint(). This function has four parameters: the lower bound (inclusive), the upper bound (exclusive), the size, and the data type of the result.

np.random.randint(low, high=None, size=None, dtype=int)

This function returns random integers drawn from a discrete uniform distribution over the range [low, high). If high is omitted, the values are drawn from [0, low) instead.

Image by Author

This function is one of the most commonly used when creating dummy datasets.
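
Here is a minimal sketch of np.random.randint(), including a tiny dummy dataset. The column ideas (ages and exam scores) are hypothetical and chosen only for illustration:

```python
import numpy as np

# Ten simulated dice rolls: integers from 1 to 6 (7 is excluded)
dice_rolls = np.random.randint(1, 7, size=10)

# With a single bound, values are drawn from [0, low): here 0 to 4
small_ints = np.random.randint(5, size=(2, 3))

# Hypothetical dummy dataset for 5 people
ages = np.random.randint(18, 66, size=5)     # 18 to 65
scores = np.random.randint(0, 101, size=5)   # 0 to 100

print(dice_rolls)
print(small_ints)
print(ages)
print(scores)
```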

np.random.normal()

Creating normally distributed random numbers can be done with (sigma * np.random.randn() + mu). However, NumPy makes it easy to do so by using the np.random.normal() function.

np.random.normal(loc=0.0, scale=1.0, size=None)

This function has three parameters: loc, which is the mean (center) of the distribution; scale, which is the standard deviation (spread); and size.

Image by Author
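
A short sketch of np.random.normal(), assuming we want dummy body heights with a mean of 170 and a standard deviation of 8 (these numbers are made up for illustration):

```python
import numpy as np

# 1,000 hypothetical heights: mean 170, standard deviation 8
heights = np.random.normal(loc=170, scale=8, size=1000)

# Should be close to 170 and 8 respectively (exact values vary per run)
print(heights.mean())
print(heights.std())

# Called without arguments, it returns one standard normal value
one_value = np.random.normal()
print(one_value)
```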

np.random.uniform()

Then, to create uniformly distributed random values, you can use the np.random.uniform() function. This function has three parameters, including lower bound, upper bound, and size.

np.random.uniform(low=0.0, high=1.0, size=None)

This function returns values that are uniformly distributed over the half-open interval [low, high) (including low, but excluding high). By default, that interval is [0.0, 1.0).

Image by Author
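
As a minimal sketch, assuming we want dummy temperature readings between 20 and 30 (the bounds and names are only illustrative):

```python
import numpy as np

# Five floats drawn uniformly from [20.0, 30.0)
temperatures = np.random.uniform(low=20.0, high=30.0, size=5)

# Without arguments, the default interval [0.0, 1.0) is used
default_value = np.random.uniform()

print(temperatures)
print(default_value)
```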

np.random.choice()

Finally, the np.random.choice() function can be used to draw random samples from a given 1-D array. This function has four parameters: the 1-D array a (or an integer, in which case samples are drawn from np.arange(a)), size, replace (True or False), and p, the probability associated with each entry of the array.

np.random.choice(a, size=None, replace=True, p=None)

By default, every entry is equally likely to be picked. Passing p assigns a specific probability to each entry, and setting replace=False ensures that no entry is picked more than once.

Image by Author
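
The three modes of np.random.choice() can be sketched like this. The color array and the probabilities are hypothetical values chosen only for illustration:

```python
import numpy as np

colors = np.array(["red", "green", "blue"])

# Sampling with replacement (the default): entries may repeat
with_replacement = np.random.choice(colors, size=5)

# Sampling without replacement: each entry appears at most once,
# so size cannot exceed the length of the array
without_replacement = np.random.choice(colors, size=3, replace=False)

# Weighted sampling: p must have one probability per entry and sum to 1
weighted = np.random.choice(colors, size=1000, p=[0.7, 0.2, 0.1])

print(with_replacement)
print(without_replacement)
print((weighted == "red").mean())  # roughly 0.7, varies per run
```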

Conclusion

NumPy provides many functions that can be used to generate random numbers, but only a few of them are used frequently. You do not need to know all of them; understanding how the ones covered here work is enough to start building dummy datasets. For more details, you can read the official NumPy documentation.

References:

[1] Legacy Random Generation — NumPy v1.23 Manual. Numpy.org. (2022). Retrieved 14 October 2022, from https://numpy.org/doc/1.23/reference/random/legacy.html.


Written by Dede Kurniawan

A writer who focuses on the topics of Python 🐍, Data Science 📊, and Biology 🧬. My LinkedIn: https://www.linkedin.com/in/dede-kurniawann/
