
Arrays in R

What is an Array?

An array is a data structure that stores values along any number of dimensions. A matrix is the two-dimensional special case; arrays extend the same idea to three, four, or more dimensions. All elements in an array must be of the same data type.

Key characteristics:

Multi-dimensional (can have 3, 4, 5, or more dimensions)
All elements must be the same data type (homogeneous)
Elements are arranged in a rectangular structure across multiple dimensions
Each element can be accessed by specifying its position in each dimension

Dimensional analogy:

Vector: 1-dimensional (a line of data)
Matrix: 2-dimensional (a table with rows and columns)
Array: any number of dimensions, typically 3 or more (multiple tables stacked together, or higher dimensions)
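
The analogy can be checked directly in R. The following is a minimal sketch; the variable names v, m, and a are chosen only for illustration. It builds all three structures from the same data and compares their dim attributes.

```r
# One data vector, three shapes
v <- 1:12                          # vector: no dim attribute
m <- matrix(v, nrow = 3)           # matrix: 3 x 4
a <- array(v, dim = c(2, 3, 2))    # array: 2 x 3 x 2

dim(v)   # NULL
dim(m)   # 3 4
dim(a)   # 2 3 2

# A matrix is an array with exactly two dimensions
is.array(m)    # TRUE
is.matrix(a)   # FALSE
```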

Creating Arrays

Using the `array()` function:

# Create a 3-dimensional array: 2×3×4 (2 rows, 3 columns, 4 layers)
data_vector <- 1:24
my_array <- array(data_vector, dim = c(2, 3, 4))
my_array
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 4

     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24

Creating arrays with different dimensions:

# 4-dimensional array: 2×2×2×3
four_d_array <- array(1:24, dim = c(2, 2, 2, 3))
four_d_array
, , 1, 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2, 1

     [,1] [,2]
[1,]    5    7
[2,]    6    8

, , 1, 2

     [,1] [,2]
[1,]    9   11
[2,]   10   12

# ... and so on

# Simple 3D array with specific values
simple_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
simple_array
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8

Creating arrays from matrices:

# Create multiple matrices and combine them into an array
matrix1 <- matrix(1:6, nrow = 2, ncol = 3)
matrix2 <- matrix(7:12, nrow = 2, ncol = 3)
matrix3 <- matrix(13:18, nrow = 2, ncol = 3)

# Combine matrices into a 3D array
combined_array <- array(c(matrix1, matrix2, matrix3), dim = c(2, 3, 3))
combined_array
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

Understanding Array Structure

Checking array properties:

my_array <- array(1:24, dim = c(2, 3, 4))

# Check if it's an array
is.array(my_array)
[1] TRUE

# Get dimensions
dim(my_array)
[1] 2 3 4

# Get number of dimensions
length(dim(my_array))
[1] 3

# Check class
class(my_array)
[1] "array"

# Check the underlying data type
typeof(my_array)
[1] "integer"

# Get structure overview
str(my_array)
 int [1:2, 1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...

# Total number of elements
length(my_array)
[1] 24

Understanding dimensions:

# For a 3D array with dim = c(2, 3, 4):
# - 1st dimension: 2 rows
# - 2nd dimension: 3 columns  
# - 3rd dimension: 4 layers/slices

# Think of it as 4 matrices, each with 2 rows and 3 columns
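
Behind the "stack of matrices" picture, R stores the values in one long vector and fills the first dimension fastest (column-major order). The sketch below illustrates that mapping for a 3D array. Note that linear_index is a hypothetical helper written for this example, not a base R function; arrayInd() is the base R function that goes the other way.

```r
my_array <- array(1:24, dim = c(2, 3, 4))

# Linear position of [row, col, layer] under column-major storage
# (linear_index is a hypothetical helper, not part of base R)
linear_index <- function(row, col, layer, dims) {
  row + (col - 1) * dims[1] + (layer - 1) * dims[1] * dims[2]
}

linear_index(1, 2, 3, dim(my_array))   # 15
my_array[15]                           # same element as my_array[1, 2, 3]

# Base R converts a linear index back to subscripts
arrayInd(15, dim(my_array))            # row 1, column 2, layer 3
```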

Accessing Array Elements (Indexing)

Individual elements:

my_array <- array(1:24, dim = c(2, 3, 4))

# Access element at row 1, column 2, layer 3
my_array[1, 2, 3]
[1] 15

# Access element at row 2, column 1, layer 1
my_array[2, 1, 1]
[1] 2

Slicing arrays (getting subsets):

# Get entire layer (matrix) 2
my_array[, , 2]
     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

# Get row 1 from all layers (result is a 3×4 matrix: columns × layers)
my_array[1, , ]
     [,1] [,2] [,3] [,4]
[1,]    1    7   13   19
[2,]    3    9   15   21
[3,]    5   11   17   23

# Get column 2 from all layers (result is a 2×4 matrix: rows × layers)
my_array[, 2, ]
     [,1] [,2] [,3] [,4]
[1,]    3    9   15   21
[2,]    4   10   16   22

# Get specific rows and columns from layer 3
my_array[1:2, 1:2, 3]
     [,1] [,2]
[1,]   13   15
[2,]   14   16

# Get multiple layers
my_array[, , c(1, 3)]
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18
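
Slicing drops any dimension that ends up with length 1, which is why a single layer comes back as a matrix and a single position across layers comes back as a plain vector. A short sketch of how the drop argument of `[` changes this, in case later code expects a 3D result:

```r
my_array <- array(1:24, dim = c(2, 3, 4))

# Default: dimensions of length 1 are dropped
dim(my_array[, , 2])                  # 2 3   (a matrix)
dim(my_array[1, 2, ])                 # NULL  (a plain vector)

# drop = FALSE keeps the result three-dimensional
dim(my_array[, , 2, drop = FALSE])    # 2 3 1
dim(my_array[1, , , drop = FALSE])    # 1 3 4
```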

Working with 4D arrays:

four_d <- array(1:48, dim = c(2, 3, 2, 4))

# Access single element: row 1, col 2, layer 1, "book" 3
four_d[1, 2, 1, 3]
[1] 27

# Get entire 3D "slice" from the 4th dimension
four_d[, , , 2]
, , 1

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 2

     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24

Adding Names to Array Dimensions

Naming dimensions:

my_array <- array(1:24, dim = c(2, 3, 4))

# Add dimension names
dimnames(my_array) <- list(
  c("Row1", "Row2"),                    # 1st dimension (rows)
  c("Col1", "Col2", "Col3"),           # 2nd dimension (columns)
  c("Layer1", "Layer2", "Layer3", "Layer4")  # 3rd dimension (layers)
)

my_array
, , Layer1

     Col1 Col2 Col3
Row1    1    3    5
Row2    2    4    6

, , Layer2

     Col1 Col2 Col3
Row1    7    9   11
Row2    8   10   12

# ... and so on

# Access by name
my_array["Row1", "Col2", "Layer3"]
[1] 15

my_array[, , "Layer1"]
     Col1 Col2 Col3
Row1    1    3    5
Row2    2    4    6

Setting names during creation:

named_array <- array(1:12, 
                    dim = c(2, 3, 2),
                    dimnames = list(
                      Subjects = c("Person1", "Person2"),
                      Tests = c("Math", "Science", "English"),
                      Time = c("Before", "After")
                    ))
named_array
, , Time = Before

        Tests
Subjects Math Science English
  Person1    1       3       5
  Person2    2       4       6

, , Time = After

        Tests
Subjects Math Science English
  Person1    7       9      11
  Person2    8      10      12
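
Once the dimensions themselves are named, the names can be inspected and used for lookups. A small sketch that reuses the named_array definition from above:

```r
named_array <- array(1:12,
                     dim = c(2, 3, 2),
                     dimnames = list(
                       Subjects = c("Person1", "Person2"),
                       Tests = c("Math", "Science", "English"),
                       Time = c("Before", "After")
                     ))

# Names of the dimensions themselves
names(dimnames(named_array))    # "Subjects" "Tests" "Time"

# Labels along one dimension
dimnames(named_array)$Tests     # "Math" "Science" "English"

# Select several labels at once; the result keeps the Tests names
after_scores <- named_array["Person2", c("Math", "English"), "After"]
after_scores                    # Math = 8, English = 12
```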

Modifying Arrays

Changing individual elements:

my_array <- array(1:24, dim = c(2, 3, 4))

# Change single element
my_array[1, 2, 3] <- 999
my_array[1, 2, 3]
[1] 999

# Change entire layer
my_array[, , 1] <- matrix(c(100, 200, 300, 400, 500, 600), nrow = 2, ncol = 3)
my_array[, , 1]
     [,1] [,2] [,3]
[1,]  100  300  500
[2,]  200  400  600

# Change the element at row 1, column 1 in every layer
my_array[1, 1, ] <- c(10, 20, 30, 40)
my_array[1, 1, ]
[1] 10 20 30 40
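
Positions do not have to be listed one by one. As a sketch of a related technique that the examples above do not cover, a logical condition can select the elements to replace; the threshold of 20 is arbitrary and chosen only for illustration.

```r
my_array <- array(1:24, dim = c(2, 3, 4))

# Replace every value greater than 20 with 0
my_array[my_array > 20] <- 0L

sum(my_array == 0)   # 4 (the former values 21, 22, 23, 24)
dim(my_array)        # 2 3 4 (the shape is unchanged)
```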

Array Arithmetic

Element-wise operations:

array1 <- array(1:8, dim = c(2, 2, 2))
array2 <- array(9:16, dim = c(2, 2, 2))

# Addition
result_add <- array1 + array2
result_add
, , 1

     [,1] [,2]
[1,]   10   14
[2,]   12   16

, , 2

     [,1] [,2]
[1,]   18   22
[2,]   20   24

# Multiplication (element-wise)
result_mult <- array1 * array2
result_mult
, , 1

     [,1] [,2]
[1,]    9   33
[2,]   20   48

, , 2

     [,1] [,2]
[1,]   65  105
[2,]   84  128

# Operations with single values
array1 + 10
, , 1

     [,1] [,2]
[1,]   11   13
[2,]   12   14

, , 2

     [,1] [,2]
[1,]   15   17
[2,]   16   18
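
Element-wise arithmetic between two arrays assumes they have the same dimensions. The sketch below shows what happens otherwise; array3 is a hypothetical array introduced only to create a shape mismatch, and tryCatch() is used so the script keeps running after the error.

```r
array1 <- array(1:8, dim = c(2, 2, 2))
array3 <- array(1:12, dim = c(2, 3, 2))   # hypothetical array with a different shape

# Arrays of different shapes cannot be combined element-wise
result <- tryCatch(array1 + array3,
                   error = function(e) conditionMessage(e))
result   # the error message reports non-conformable arrays

# Same shape: the result keeps the dimensions of its operands
identical(dim(array1 + array1), dim(array1))   # TRUE
```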

Useful Array Functions

Apply functions across dimensions:

my_array <- array(1:24, dim = c(2, 3, 4))

# MARGIN names the dimension(s) to keep; the function is applied over all the others
# MARGIN = 1: one result per row
apply(my_array, 1, sum)  # Sum each row across all columns and layers
[1] 144 156

# MARGIN = 2: one result per column
apply(my_array, 2, sum)  # Sum each column across all rows and layers
[1]  84 100 116

# MARGIN = 3: one result per layer
apply(my_array, 3, sum)  # Sum each layer across all rows and columns
[1]  21  57  93 129

# MARGIN = c(1,2): preserve rows and columns, sum across layers
apply(my_array, c(1, 2), sum)
     [,1] [,2] [,3]
[1,]   40   48   56
[2,]   44   52   60

# Apply other functions
apply(my_array, 3, mean)  # Mean of each layer
[1]  3.5  9.5 15.5 21.5

apply(my_array, c(1, 2), max)  # Maximum across layers for each row/column combo
     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24
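
A quick way to sanity-check apply() results is to confirm that each set of totals adds up to the sum of the whole array. The sketch below does that and also shows the shape of the result when two dimensions are kept; the variable names are illustrative.

```r
my_array <- array(1:24, dim = c(2, 3, 4))

row_totals   <- apply(my_array, 1, sum)   # one value per row
col_totals   <- apply(my_array, 2, sum)   # one value per column
layer_totals <- apply(my_array, 3, sum)   # one value per layer

# Each set of totals covers the same 24 elements
sum(row_totals) == sum(my_array)     # TRUE
sum(col_totals) == sum(my_array)     # TRUE
sum(layer_totals) == sum(my_array)   # TRUE

# rowSums() gives the same per-row totals as apply(..., 1, sum)
all(rowSums(my_array) == row_totals) # TRUE

# Keeping two dimensions returns a matrix with exactly those dimensions
dim(apply(my_array, c(1, 3), sum))   # 2 4
```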

Other useful functions:

# Get margins (dimension sizes)
margins <- dim(my_array)
margins
[1] 2 3 4

# Convert to vector (flattened)
as.vector(my_array)
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

# Get dimension names
dimnames(my_array)
NULL  # (if no names were set)

Practical Examples

Example 1: Student test scores over time

# Scores for 3 students, 4 subjects, across 2 semesters
scores <- array(
  data = c(85, 90, 78,    # Math, Semester 1
           88, 92, 80,    # Science, Semester 1  
           82, 87, 75,    # English, Semester 1
           79, 85, 72,    # History, Semester 1
           87, 93, 80,    # Math, Semester 2
           90, 95, 83,    # Science, Semester 2
           84, 89, 77,    # English, Semester 2
           81, 87, 74),   # History, Semester 2
  dim = c(3, 4, 2),
  dimnames = list(
    Students = c("Alice", "Bob", "Carol"),
    Subjects = c("Math", "Science", "English", "History"),
    Semester = c("Fall", "Spring")
  )
)

# View all data
scores

# Get Alice's scores across all subjects and semesters
scores["Alice", , ]

# Get Math scores for all students in Spring semester
scores[, "Math", "Spring"]

# Calculate average score per student across all subjects and semesters
apply(scores, 1, mean)
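
Two further summaries follow naturally from the same data. This is a sketch that repeats the scores definition so it runs on its own; the names semester_means and improvement are introduced here for illustration.

```r
scores <- array(
  data = c(85, 90, 78,  88, 92, 80,  82, 87, 75,  79, 85, 72,   # Fall
           87, 93, 80,  90, 95, 83,  84, 89, 77,  81, 87, 74),  # Spring
  dim = c(3, 4, 2),
  dimnames = list(
    Students = c("Alice", "Bob", "Carol"),
    Subjects = c("Math", "Science", "English", "History"),
    Semester = c("Fall", "Spring")
  )
)

# Average per student within each semester (3 x 2 matrix)
semester_means <- apply(scores, c(1, 3), mean)
semester_means

# Change from Fall to Spring for every student and subject (3 x 4 matrix)
improvement <- scores[, , "Spring"] - scores[, , "Fall"]
improvement

# Average improvement per student
rowMeans(improvement)   # Alice 2.00, Bob 2.50, Carol 2.25
```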

Example 2: Image data (RGB values)

# Create a simple 4×4 pixel image with RGB channels
# Dimensions: height × width × color_channels
image_data <- array(
  data = sample(0:255, 48, replace = TRUE),  # Random pixel values
  dim = c(4, 4, 3),  # 4×4 pixels, 3 color channels (R, G, B)
  dimnames = list(
    Height = paste0("Row", 1:4),
    Width = paste0("Col", 1:4),
    Channel = c("Red", "Green", "Blue")
  )
)

# View red channel
image_data[, , "Red"]

# Get RGB values for pixel at row 2, column 3
image_data[2, 3, ]

# Calculate average intensity for each color channel
apply(image_data, 3, mean)
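
The same slicing and apply() patterns cover simple image operations. The sketch below fixes the random seed (an assumption added here so the run is repeatable) and uses a plain channel average as a stand-in for grayscale conversion; real image libraries typically weight the channels differently.

```r
set.seed(42)  # assumption: fixed seed so the random pixels are repeatable

image_data <- array(
  data = sample(0:255, 48, replace = TRUE),
  dim = c(4, 4, 3),
  dimnames = list(
    Height = paste0("Row", 1:4),
    Width = paste0("Col", 1:4),
    Channel = c("Red", "Green", "Blue")
  )
)

# Collapse the channel dimension: one averaged intensity per pixel
grayscale <- apply(image_data, c(1, 2), mean)
dim(grayscale)   # 4 4

# Invert a single channel with ordinary element-wise arithmetic
inverted_red <- 255 - image_data[, , "Red"]
range(inverted_red)   # stays within 0 to 255
```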

Converting Between Data Types

Array to matrix (for 2D arrays):

# A 2D array is already a matrix, so as.matrix() returns it unchanged
two_d_array <- array(1:12, dim = c(3, 4))
as.matrix(two_d_array)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

Array to vector:

my_array <- array(1:8, dim = c(2, 2, 2))
as.vector(my_array)
[1] 1 2 3 4 5 6 7 8

Reshaping arrays:

# Change dimensions while keeping the same data
original <- array(1:24, dim = c(2, 3, 4))
reshaped <- array(original, dim = c(4, 6, 1))  # Reshape to 4×6×1
dim(reshaped)
[1] 4 6 1
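
Reshaping only relabels the same sequence of values, so the new dimensions must multiply to the same number of elements. The sketch below shows the dim() assignment form and contrasts it with aperm(), which reorders the dimensions and therefore moves elements around.

```r
original <- array(1:24, dim = c(2, 3, 4))

# Reshape by assigning to dim(); the element order is unchanged
reshaped <- original
dim(reshaped) <- c(4, 6)
identical(as.vector(reshaped), as.vector(original))   # TRUE

# The product of the new dimensions must equal the number of elements
prod(dim(reshaped)) == length(original)               # TRUE

# aperm() permutes dimensions: here layers become the first dimension
permuted <- aperm(original, c(3, 1, 2))
dim(permuted)                                         # 4 2 3
permuted[3, 1, 2] == original[1, 2, 3]                # TRUE
```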

Key Points to Remember

Arrays extend matrices - matrices are 2D, arrays can be 3D, 4D, or higher
All elements same type - homogeneous data structure
Dimensional indexing - use [dim1, dim2, dim3, ...] notation
Leave dimensions empty - use [, , 2] to get entire layer 2
apply() is powerful - MARGIN names the dimensions to keep; the function is applied over all the others
Dimension order matters - typically rows, columns, then higher dimensions
Memory considerations - the element count is the product of all dimension sizes, so arrays can get very large very quickly
Names improve readability - use dimnames() for better code clarity
Think in slices - 3D arrays are like stacks of matrices
Real-world applications - common in image processing, time series analysis, and scientific computing

Arrays are powerful for handling complex multi-dimensional data but can be memory-intensive. They're particularly useful in scientific computing, image processing, and when you need to organize data across multiple categorical dimensions simultaneously.
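
To make the memory point concrete, the size of an array can be estimated before creating it. This is a rough sketch: the dimensions are hypothetical, and the 8 and 4 bytes per element are the usual sizes of doubles and integers in R, ignoring the small fixed overhead each object carries.

```r
dims <- c(1000, 1000, 10)        # hypothetical dimensions
n_elements <- prod(dims)         # 10 million elements

# Approximate size in MB
n_elements * 8 / 1024^2          # doubles: about 76.3
n_elements * 4 / 1024^2          # integers: about 38.1

# object.size() reports the actual footprint of an existing object
small <- array(0, dim = c(100, 100, 10))
object.size(small)               # a little over 100 * 100 * 10 * 8 bytes
```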