We often create data in NumPy arrays and then convert it to a DataFrame so that we can work with pandas methods.

In such cases, converting NumPy arrays (ndarrays) to a pandas DataFrame makes data analysis more convenient. In this tutorial, we'll take a closer look at some common approaches for converting a NumPy array to a pandas DataFrame.

We'll also look at some common tricks for handling NumPy arrays that hold different kinds of values when converting them to a pandas DataFrame.


## Creating NumPy arrays (ndarrays)

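NumPy arrays are multidimensional arrays. Every element of an array shares a single dtype, so the data is homogeneous; heterogeneous values can only be stored by falling back to the generic object dtype.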
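To make the dtype behavior concrete, here is a minimal sketch. The variable names and sample values are illustrative assumptions, not part of the original tutorial. It shows how NumPy settles on one common dtype for numeric input and uses object only when we ask for it.

```python
import numpy as np

# Integers only: NumPy keeps an integer dtype (its exact width depends on the platform)
ints = np.array([1, 2, 3])

# Mixing ints and floats: every element is upcast to float64
mixed = np.array([1, 2.5, 3])

# Truly heterogeneous values need the generic object dtype
objs = np.array([25, "Karlos", 2015.5], dtype=object)

print(ints.dtype.kind)   # 'i' stands for a signed integer dtype
print(mixed.dtype)       # float64
print(objs.dtype)        # object
```

The object dtype is what the heterogeneous example later in this tutorial relies on.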

There are several ways that we can create a NumPy array.

Method 1: Use the arange() method: It creates a sequence of values according to the given parameters, starting from zero by default. Here is a code snippet showing how to use it.

```python
import numpy as np

# Create a one-dimensional array holding the values 0 through 19
arry = np.arange(20)
print(arry)
```

**Output**

This is a one-dimensional array.

Method 2: Use a list and numpy.array(): In this technique, we use the numpy.array() method and pass it a list to convert to an array. Here is a code snippet showing how to use it.

```python
import numpy as np

li = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Convert the Python list to a one-dimensional array
arry = np.array(li)
print(arry)
```

**Output**

A DataFrame, however, is usually built from a two-dimensional array. To create a two-dimensional array, we have two different approaches:

Using arange() and reshape(): We can call these two methods one after the other to generate a sequence of values and arrange it in the required shape. Here is a code snippet showing how to use it.

```python
import numpy as np

# Generate 24 values and arrange them as 8 rows by 3 columns
arry = np.arange(24).reshape(8, 3)
print(arry)
```

**Output**

Using list and numpy.array(): In this technique, we use the numpy.array() method and pass a nested list to convert it to an array. Here is a code snippet showing how to use it.

```python
import numpy as np

# Each inner list becomes one row of the two-dimensional array
li = [[10, 20, 30, 40], [42, 52, 62, 72]]
arry = np.array(li)
print(arry)
```

**Output**

## Converting a homogeneous NumPy array (ndarrays) using the DataFrame constructor

A pandas DataFrame is a two-dimensional collection of data arranged in rows and columns. It can store both homogeneous and heterogeneous data, because each column has its own dtype.

We need to use the DataFrame() constructor to create a DataFrame from a NumPy array. Here is a code snippet showing how to use it.

```python
import numpy as np
import pandas as pd

li = [[10, 20, 30, 40], [42, 52, 62, 72]]
arry = np.array(li)

# Pass the two-dimensional array to the DataFrame constructor
dataf = pd.DataFrame(arry)
print(dataf)
print()
print(type(dataf))
```

**Output**

## Add column names and an index to the converted DataFrame

We can use the columns and index parameters of DataFrame() to set the column names and index labels for the DataFrame.

By default, the column and index values start at 0 and increment by 1. Here's an example of a DataFrame specifying the columns and index.

```python
import numpy as np
import pandas as pd

li = [[10, 20, 30, 40], [42, 52, 62, 72]]
arry = np.array(li)

# Label the two rows and the four columns explicitly
dataf = pd.DataFrame(arry, index=['R1', 'R2'],
                     columns=['ColA', 'ColB', 'ColC', 'ColD'])
print(dataf)
print()
print(type(dataf))
```

**Output**
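As a small complement, the sketch below first inspects the default labels and then assigns new ones after the DataFrame already exists. It reuses the array from the example above; setting the index and columns attributes afterwards is an alternative shown only for illustration, not something the original snippet does.

```python
import numpy as np
import pandas as pd

arry = np.array([[10, 20, 30, 40], [42, 52, 62, 72]])
dataf = pd.DataFrame(arry)

# Default labels are consecutive integers starting at 0
print(list(dataf.index))    # [0, 1]
print(list(dataf.columns))  # [0, 1, 2, 3]

# Labels can also be assigned after construction
dataf.index = ['R1', 'R2']
dataf.columns = ['ColA', 'ColB', 'ColC', 'ColD']
print(dataf.loc['R2', 'ColC'])  # 62
```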

## Convert a heterogeneous NumPy array to a DataFrame

We can also create a DataFrame from a NumPy array that contains heterogeneous values and was built from a nested list.

We can pass the ndarray object to the DataFrame() constructor and set the columns parameter to create a DataFrame that holds heterogeneous values.

Here is an example of a DataFrame with heterogeneous data.

```python
import numpy as np
import pandas as pd

# dtype=object lets one array hold both numbers and strings
arry = np.array([[25, 'Karlos', 2015], [21, 'Gaurav', 2016], [22, 'Dee', 2018]],
                dtype=object)
df = pd.DataFrame(arry, columns=['Age', 'Student_Name', 'Passing Year'],
                  index=[1, 2, 3])
print(df)
```

**Output**
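One thing worth checking after such a conversion is the column dtypes. The following is a minimal sketch, assuming we want numeric columns back: because the source array has the object dtype, the number columns arrive as object too, and infer_objects() asks pandas to choose a more specific dtype for each column. The name typed is hypothetical.

```python
import numpy as np
import pandas as pd

arry = np.array([[25, 'Karlos', 2015], [21, 'Gaurav', 2016], [22, 'Dee', 2018]],
                dtype=object)
df = pd.DataFrame(arry, columns=['Age', 'Student_Name', 'Passing Year'])

# Built from an object array, the numeric columns start out with the object dtype
print(df.dtypes)

# infer_objects() returns a copy with a better-fitting dtype per column
typed = df.infer_objects()
print(typed.dtypes)
```

With integer dtypes restored, numeric operations such as typed['Age'].mean() behave as expected.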

## Create DataFrame from NumPy array by columns

This is another approach to creating a DataFrame from a NumPy array, using column-wise indexing of a two-dimensional ndarray.

It works like column-major access to a general matrix: each column of the array becomes one DataFrame column. Here is an example showing how to use it.

```python
import numpy as np
import pandas as pd

arry = np.array([[10, 20, 30, 40], [15, 18, 20, 23], [51, 42, 33, 24]])
print(arry, "\n")

# Create the pandas DataFrame, one array column per DataFrame column
myDat = pd.DataFrame({'col_1': arry[:, 0],
                      'col_2': arry[:, 1],
                      'col_3': arry[:, 2],
                      'col_4': arry[:, 3]})
print(myDat)
```

**Output**

## Create DataFrame from NumPy array by rows

Here's another approach to creating a DataFrame from a NumPy array, this time using row-wise indexing of a two-dimensional ndarray. It works like row-major access to a general matrix: each row of the array becomes one DataFrame column, so the index labels correspond to the array's original columns. Here is an example showing how to use it.

```python
import numpy as np
import pandas as pd

arry = np.array([[10, 20, 30, 40], [15, 18, 20, 23], [51, 42, 33, 24]])
print(arry, "\n")

# Create the pandas DataFrame, one array row per DataFrame column
myDat = pd.DataFrame({'row_1': arry[0, :],
                      'row_2': arry[1, :],
                      'row_3': arry[2, :]},
                     index=['Col1', 'Col2', 'Col3', 'Col4'])
print(myDat)
```

**Output**
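Building a DataFrame row by row in this way amounts to transposing the array. The sketch below is an illustration under that reading, with hypothetical names by_rows and via_transpose: it constructs the frame both ways and checks that the results match.

```python
import numpy as np
import pandas as pd

arry = np.array([[10, 20, 30, 40], [15, 18, 20, 23], [51, 42, 33, 24]])
labels = ['Col1', 'Col2', 'Col3', 'Col4']

# Row-by-row construction: each array row becomes a DataFrame column
by_rows = pd.DataFrame({'row_1': arry[0, :],
                        'row_2': arry[1, :],
                        'row_3': arry[2, :]}, index=labels)

# The same frame from the transposed array in a single step
via_transpose = pd.DataFrame(arry.T, index=labels,
                             columns=['row_1', 'row_2', 'row_3'])

print(by_rows.equals(via_transpose))  # True
```

The transposed form scales better when the array has many rows, since no row has to be listed by hand.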

## Concatenate NumPy array to pandas DataFrame

We can also concatenate NumPy arrays with a pandas DataFrame by creating one DataFrame from each ndarray and attaching a column of the second to the first with the assignment operator (=). Here is a code snippet showing how to implement it.

```python
import numpy as np
import pandas as pd

ary = np.array([['India', 91], ['USA', 1], ['France', 33]], dtype=object)
print(ary)
print(type(ary), "\n")

df = pd.DataFrame(ary, columns=['CountryName', 'PhoneCode'])

arr1 = np.array([['Jio'], ['Airtel'], ['AT&T']], dtype=object)
df2 = pd.DataFrame(arr1, columns=['Brand'])

# Attach the second DataFrame's column to the first one by assignment
df['Brand_Name'] = df2['Brand']
print(df)
```

**Output**
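A detail to keep in mind with this technique is that assigning a pandas column aligns rows by index label, while assigning a raw NumPy array is purely positional. The sketch below illustrates the difference; the custom index [1, 2, 3] and the column names Aligned and Positional are assumptions made up for the demonstration.

```python
import numpy as np
import pandas as pd

# A DataFrame whose index does not start at 0
df = pd.DataFrame({'CountryName': ['India', 'USA', 'France']}, index=[1, 2, 3])
brands = np.array(['Jio', 'Airtel', 'AT&T'], dtype=object)

# df2 carries the default labels 0, 1, 2, so only labels 1 and 2 line up
df2 = pd.DataFrame(brands, columns=['Brand'])
df['Aligned'] = df2['Brand']

# Assigning the array itself ignores labels and fills the rows in order
df['Positional'] = brands
print(df)
```

When both DataFrames use the default integer index, as in the example above, the two forms give the same result.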

## Add NumPy array as new column inside DataFrame

We can also embed a 2D NumPy array directly into a pandas DataFrame. To do this, we convert the array (built from a nested list) to a pandas DataFrame and assign it to a new, named column of the existing DataFrame.

Here is a code snippet showing how to add a new column based on a NumPy array directly with a column name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(4, 13).reshape(3, 3))

# Wrap the 3x1 array in a single-column DataFrame and assign it as a new column
df['New_Col'] = pd.DataFrame(np.array([[2], [4], [6]]))
print(df)
```

**Output**

## NumPy array to DataFrame with concat()

concat() is another powerful pandas method that concatenates two DataFrames into a new one. We can use it to combine an existing DataFrame with a NumPy array that has been wrapped in a DataFrame.

The syntax is pandas.concat([dataframe1, pandas.DataFrame(ndarray)], axis=1). Here is a code snippet showing how to implement it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': [25, 12, 15, 14, 19],
                   'value2': [52, 17, 12, 9, 41],
                   'value3': [10, 30, 15, 11, 14]})
newArr = np.array([[12, 13], [11, 10], [22, 17], [18, 27], [31, 14]])

# axis=1 places the new columns side by side with the existing ones
new_df = pd.concat([df, pd.DataFrame(newArr)], axis=1)
print(new_df)
```

**Output**
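Since concat() with axis=1 matches rows by index label, it can help to give the wrapped array the same index as the existing DataFrame, along with readable column names. The following is a minimal sketch of that idea; the index labels and the names extra1 and extra2 are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': [25, 12, 15]}, index=['a', 'b', 'c'])
new_arr = np.array([[12, 13], [11, 10], [22, 17]])

# Reuse df's index so rows line up, and name the new columns explicitly
extra = pd.DataFrame(new_arr, index=df.index, columns=['extra1', 'extra2'])

new_df = pd.concat([df, extra], axis=1)
print(new_df)
```

Without index=df.index, the wrapped array would get the labels 0, 1, 2, which do not match 'a', 'b', 'c', and the result would contain NaN-padded rows.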

## Convert NumPy Array to DataFrame with random.rand() and reshape()

We can generate some random numbers (using random.rand()) and reshape the entire object into a two-dimensional NumPy array format using reshape().

Then we can convert it to a DataFrame. Here is a code snippet showing how to implement it.

```python
import numpy as np
import pandas as pd

# Draw 8 random values from [0, 1) and arrange them as 2 rows by 4 columns
arry = np.random.rand(8).reshape(2, 4)
print("Numpy array:")
print(arry)

# Convert the NumPy array to a DataFrame
df = pd.DataFrame(arry, columns=['C1', 'C2', 'C3', 'C4'])
print("\n Pandas DataFrame: ")
print(df)
```

**Output**
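Because the values are random, they change on every run. If repeatable results are needed, one option is NumPy's seeded Generator API, sketched below. The seed value 42 is an arbitrary assumption, and this variant is an alternative to the random.rand() call used above, not the tutorial's own method.

```python
import numpy as np
import pandas as pd

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)
arry = rng.random(8).reshape(2, 4)

df = pd.DataFrame(arry, columns=['C1', 'C2', 'C3', 'C4'])
print(df)

# Re-creating the generator with the same seed reproduces the same values
again = np.random.default_rng(seed=42).random(8).reshape(2, 4)
print(np.array_equal(arry, again))  # True
```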

## Adding the NumPy Array to the pandas DataFrame with tolist()

We can also use NumPy's tolist() method to convert an entire NumPy array to a list and place it in a DataFrame column.

The syntax looks like this: dataframe_object['column_name'] = ndarray_object.tolist(). Here is a code snippet showing how to use it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': [25, 12, 15, 14, 19],
                   'value2': [52, 17, 12, 9, 41],
                   'value3': [10, 30, 15, 11, 14]})
new = np.array([3, 7, 1, 0, 5])

# tolist() turns the array into a plain Python list before assignment
df['Newcol'] = new.tolist()
print(df)
```

**Output**
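For comparison, an ndarray can also be assigned to a column without tolist(). The sketch below contrasts the two under one assumption chosen for illustration: the array uses the compact int8 dtype. Going through a list makes pandas infer the dtype again, while direct assignment keeps the array's dtype. The column names from_list and from_array are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': [25, 12, 15, 14, 19]})
new = np.array([3, 7, 1, 0, 5], dtype=np.int8)

# Via a Python list: pandas infers the dtype from the list elements
df['from_list'] = new.tolist()

# Direct assignment: the column keeps the array's own dtype
df['from_array'] = new

print(df.dtypes)
```

Both columns hold the same values; only the storage dtype differs.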

## Creating DataFrames with np.zeros()

We can also create a DataFrame from numpy.zeros(). The resulting ndarray is filled with zeros and can be passed to the DataFrame constructor like any other array.

Here is a code snippet showing how to implement it.

```python
import numpy as np
import pandas as pd

# Build a 5x3 array filled with zeros
arry = np.zeros((5, 3))
print("Numpy array:")
print(arry)

# Convert the array to a DataFrame with named columns
df = pd.DataFrame(arry, columns=['C1', 'C2', 'C3'])
print("\n Pandas DataFrame: ")
print(df)
```

**Output**

## Creating DataFrames with random.choice() from NumPy array

Another way to create a DataFrame from a NumPy array is to use random.choice() and pass its result to the DataFrame() constructor, which directly converts the NumPy array of a given size to a DataFrame. Here is a script showing how to implement it.

```python
import numpy as np
import pandas as pd

# Draw a 3x4 array of random integers from 0 to 11 and label the columns A to D
df = pd.DataFrame(np.random.choice(12, (3, 4)), columns=list('ABCD'))
print("\n Pandas DataFrame: ")
print(df)
```

**Output**

## Transpose a NumPy array before creating a DataFrame

We can create a transpose of a NumPy array and place it in a DataFrame. Here's a code example showing how to implement it.

```python
import numpy as np
import pandas as pd

arry = np.array([[4, 8], [15, 18], [18, 21], [13, 19], [10, 15], [7, 12], [4, 2],
                 [5, 1], [8, 4], [9, 24], [23, 35], [10, 22], [12, 27]])

# Transpose the 13x2 array into a 2x13 array
arry_tp = arry.transpose()
print(arry_tp)
print()

# Each row of the transposed array becomes one DataFrame column
df = pd.DataFrame({'col1': arry_tp[0], 'col2': arry_tp[1]})
print(df.tail())
```

**Output**

## Create an empty DataFrame from an empty NumPy array

We can create an empty DataFrame from NumPy's NaN (Not a Number) value and then blank out the cells. Here is a code snippet showing how to implement it.

```python
import pandas as pd
import numpy as np

# Broadcast NaN over a 3x4 grid of index and column labels
df = pd.DataFrame(np.nan, index=[0, 1, 2], columns=['A', 'B', 'C', 'D'])

# Replace NaN with a blank string so the frame prints as empty cells
df = df.fillna(' ')
print(df)
```

**Output**
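The snippet above passes the scalar np.nan and lets pandas broadcast it. To start from an actual NaN-filled ndarray instead, one possible sketch uses np.full(); the name nan_arr is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Build a 3x4 array in which every element is NaN
nan_arr = np.full((3, 4), np.nan)

df = pd.DataFrame(nan_arr, index=[0, 1, 2], columns=list('ABCD'))
print(df)

# Every cell is missing, so the frame has a shape but no data yet
print(df.isna().all().all())  # True
```

Keeping the cells as NaN, rather than filling them with blank strings, leaves the columns as float64 so that numbers can be written into them later.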

## Generating DataFrame through iterations of NumPy arrays

We can use list comprehensions inside the DataFrame() constructor to generate the row and column labels by iterating over the dimensions given by the array's shape attribute.

The result is a labeled DataFrame built from the ndarray. Here is a script that shows how to do it.

```python
import pandas as pd
import numpy as np

arry = np.array([[2, 4, 6], [10, 20, 30]])

# Generate one label per row and per column from the array's shape
df = pd.DataFrame(data=arry[0:, 0:],
                  index=['Row-' + str(g + 1) for g in range(arry.shape[0])],
                  columns=['Column-' + str(g + 1) for g in range(arry.shape[1])])
print(df)
```

**Output**

## Visualizing a NumPy array and a pandas DataFrame

We can also visualize a NumPy array or a DataFrame using data visualization libraries like matplotlib. Here is a code snippet showing how to implement it.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

arry = np.random.random((30, 3))
print(arry)

# Scatter plot for the NumPy array, colored by the third column
plt.scatter(arry[:, 0], arry[:, 1], c=arry[:, 2], cmap='Purples')

df = pd.DataFrame(arry, columns=['C1', 'C2', 'C3'])
print(df)

# Scatter plot for the DataFrame using two of its columns
plt.scatter(df["C1"], df["C2"])
plt.show()
```

**Output**


## Conclusion

Converting data from a NumPy array to a DataFrame is an important, everyday task for a data scientist.

Often we need to work with DataFrame instead of NumPy arrays. Data conversion plays an essential role in order to use the Pandas library to its full extent.

We hope this tutorial has given you a clear idea of how to create and convert NumPy arrays (ndarrays) to DataFrame.

This tutorial has highlighted how to convert a homogeneous ndarray using the DataFrame constructor, add indices to a converted DataFrame, convert a heterogeneous NumPy array to a DataFrame, and create a DataFrame from a NumPy array by rows and columns.

We also covered techniques such as concatenating NumPy arrays with a pandas DataFrame, adding a NumPy array as a new column inside a DataFrame, and using the concat() method.

Finally, we looked at creating an empty DataFrame from NaN values and at visualizing both a NumPy array and a pandas DataFrame.

## Further reading

For more information, see the official docs:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

Gaurav Krroy

Gaurav is a Full-Stack (sr.) Technology Content Engineer (6.5 years experience) and has a deep passion for curating articles, blogs, eBooks, tutorials, infographics and other web content. Other than that, he is engaged in security research and has found bugs for many governments and private companies around the world. He is the author of two books and has contributed to more than 500 articles and blogs. He is a computer science educator and loves spending time on lean programming, data science, privacy, and SEO. In addition to writing, he loves to play table football, read novels, and dance.