Big Data Challenges: Is HTTP GET Letting You Down?
A battle of complexity, correctness and practicality
Have you ever used the GET method to retrieve data?
Have you passed parameters via the query string to filter results?
Have you ever hit a wall when the data to be passed became too large or complex to handle?
If you have, you’re not alone. While GET is often recommended for data retrieval, there are scenarios where it becomes impractical. Here, I’ll share my journey navigating these challenges and what I learned along the way.
The Starting Problem
I started with a straightforward GET request to retrieve data. Here’s my initial implementation on the frontend:
generateResults() {
  this.generatingResults = true;
  apiFetch({
    url: '/api/result_wizard',
    method: 'GET',
  }).then(response => {
    // ...
  });
}
On the backend, I had this simple Django view:
@require_http_methods(['GET'])
def result_wizard(request):
    results = _result_wizard()
    return ResourceResponse(results, status=200)
The results were then displayed on the front end. Initially, everything worked fine—until I realized I needed to pass additional parameters to filter the results.
The problem? My dataset, this.inputPools, wasn't small or simple.
Stages of Solutions
Passing Small Data via Query Parameters
For small datasets, converting the data into query parameters is a simple fix:
Frontend:
const params = new URLSearchParams(this.inputPools).toString();
const url = `/api/result_wizard?${params}`;
Backend:
@require_http_methods(['GET'])
def result_wizard(request):
    # The decorator already guarantees GET, so no method check is needed
    data = request.GET.dict()  # Converts query parameters into a dictionary
This approach works well for simple key-value pairs, but as the data structure grows, so do the challenges.
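The same round trip can be sketched outside Django with Python's standard library (a standalone illustration with made-up keys, not the actual view): urlencode builds the query string the frontend would send, and parse_qs reads it back.

```python
from urllib.parse import urlencode, parse_qs

# A small, flat payload, analogous to simple this.inputPools values
input_pools = {"strategy": "custom", "size": "20"}

# Frontend side: build the query string
query = urlencode(input_pools)
print(query)  # strategy=custom&size=20

# Backend side: parse_qs wraps every value in a list, so unwrap single values
parsed = parse_qs(query)
data = {key: values[0] for key, values in parsed.items()}
print(data)  # {'strategy': 'custom', 'size': '20'}
```

The unwrap step mirrors what request.GET.dict() does for you in Django: one value per key, which is exactly why it stops being enough once values become lists.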
Handling Lists in Query Parameters
What if this.inputPools includes keys with lists as values? For example:
this.inputPools = {
  name: ['input1', 'input2'],
  strategy: 'custom',
  space: ['A', 'B'],
};
In this case, URLSearchParams(this.inputPools).toString() is no longer enough: when constructed from an object, URLSearchParams stringifies each array into a single comma-joined value instead of repeating the key, so the frontend has to append each list element explicitly. On the backend, request.GET.dict() won't handle the lists correctly either, since it keeps only the last value for each key. Instead, I had to extract all query parameters, including lists, explicitly:
# Extract all query parameters, including lists
data = {key: request.GET.getlist(key) for key in request.GET.keys()}
This approach captures multiple values for the same key as lists.
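Python's urlencode with doseq=True and parse_qs mirror this behavior and make it easy to see how repeated keys round-trip (a standalone sketch, not the Django code above):

```python
from urllib.parse import urlencode, parse_qs

# Mirrors the this.inputPools example above
input_pools = {"name": ["input1", "input2"], "strategy": "custom", "space": ["A", "B"]}

# doseq=True emits one key=value pair per list element:
# name=input1&name=input2&strategy=custom&space=A&space=B
query = urlencode(input_pools, doseq=True)

# parse_qs gathers repeated keys into lists, like request.GET.getlist
data = parse_qs(query)
print(data)  # {'name': ['input1', 'input2'], 'strategy': ['custom'], 'space': ['A', 'B']}
```

Note that even the scalar 'custom' comes back as a one-element list; that is the price of a uniform `getlist`-style extraction.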
Passing a List to the Backend
What if this.inputPools itself is a list? Passing a list directly via GET is possible but requires serialization. Here are three approaches I explored:
1. Repeated Query Parameters
const params = new URLSearchParams();
this.inputPools.forEach((pool) => params.append('pool', pool));
// URL: /api/result_wizard?pool=item1&pool=item2&pool=item3
Backend:
data = request.GET.getlist('pool')
print(data) # Output: ['item1', 'item2', 'item3']
2. JSON String
params.append('pools', JSON.stringify(this.inputPools));
// URL: /api/result_wizard?pools=["item1","item2","item3"]
Backend:
data = json.loads(request.GET.get('pools', '[]'))
print(data) # Output: ['item1', 'item2', 'item3']
3. Comma-Separated String
params.append('pools', this.inputPools.join(',')); // Join list items with commas
// URL: /api/result_wizard?pools=item1,item2,item3
Backend:
data = request.GET.get('pools', '').split(',')
print(data) # Output: ['item1', 'item2', 'item3']
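A quick, framework-free comparison of approaches 2 and 3 in Python shows why the JSON string is the safer of the two (the variable names here are illustrative):

```python
import json
from urllib.parse import quote, unquote

pools = ["item1", "item2", "item3"]

# Approach 2: a JSON string survives any characters inside the items
json_param = quote(json.dumps(pools))
restored = json.loads(unquote(json_param))

# Approach 3: comma-separated is shorter, but an item containing a comma
# is silently split into two items on the backend
csv_param = ",".join(pools)
print(restored == csv_param.split(","))  # True
print("a,b".split(","))  # ['a', 'b']  -- one item became two
```

For clean, comma-free tokens the two are interchangeable; the moment the items carry arbitrary user input, only the JSON round trip stays lossless.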
But hang on.
While these approaches worked for simple lists, my real challenge was handling a list of dictionaries. Something like:
this.inputPools = [
  {"name": "input1", "feature": "space1"},
  {"name": "input2", "feature": "space2"}
]
Passing a List of Dictionaries
Now that my this.inputPools is a complex list of dictionaries, GET becomes unwieldy.
One option is still to serialize the data into a JSON string:
params.append('data', JSON.stringify(this.inputPools)); // Serialize the list of dictionaries
const url = `/api/result_wizard?${params.toString()}`;
// URL: /api/result_wizard?data=%5B%7B%22name%22%3A%22input1%22%2C%22feature%22%3A%22space1%22%7D%2C%7B%22name%22%3A%22input2%22%2C%22feature%22%3A%22space2%22%7D%5D
Backend:
raw_data = request.GET.get('data', '')
# Parse the JSON string into a Python list of dictionaries
data = json.loads(raw_data)
This works, but the URLs quickly become long and ugly. Another option is custom key-value encoding to flatten the dictionaries, but this adds complexity and becomes inefficient:
const params = new URLSearchParams();
this.inputPools.forEach((item, index) => {
  Object.entries(item).forEach(([key, value]) => {
    params.append(`data[${index}][${key}]`, value);
  });
});
The backend would then have to parse the flattened structure.
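For illustration, here is one way a backend could rebuild the list of dictionaries from those flattened data[index][key] parameters. The regex and reassembly logic are my own sketch, not code from the actual view:

```python
import re
from urllib.parse import parse_qs

# The query string produced by the flattening loop above
query = ("data[0][name]=input1&data[0][feature]=space1"
         "&data[1][name]=input2&data[1][feature]=space2")

params = parse_qs(query)
pattern = re.compile(r"data\[(\d+)\]\[(\w+)\]")

# Group values back into per-index dictionaries
pools = {}
for key, values in params.items():
    match = pattern.fullmatch(key)
    if match:
        index, field = int(match.group(1)), match.group(2)
        pools.setdefault(index, {})[field] = values[0]

# Re-emit the list in index order
result = [pools[i] for i in sorted(pools)]
print(result)  # [{'name': 'input1', 'feature': 'space1'}, {'name': 'input2', 'feature': 'space2'}]
```

Every request now pays for a regex match per parameter plus the regrouping pass, which is exactly the kind of accidental complexity the next section argues against.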
At this point, constructing and parsing the query string became so cumbersome that GET started to feel impractical. Compression is another option, but even that felt like over-engineering.
When GET Becomes Too Complex
When you must compress or flatten large datasets to fit them into a GET request, it’s time to step back and reconsider. GET was never designed for handling large or complex data structures, and pushing its limits introduces unnecessary complexity and fragility.
Here are the key drawbacks of using GET for large datasets:
Ugly URLs: They become unreadable and challenging to debug.
URL Length Limits: Most browsers and servers cap URLs at roughly 2,000–8,000 characters.
Encoding Overhead: Encoding and decoding large datasets defeats the simplicity of GET.
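That encoding overhead is easy to measure. The sketch below, using an arbitrary 50-item payload in the shape of this.inputPools, shows how quickly a percent-escaped JSON query parameter grows:

```python
import json
from urllib.parse import quote

# An arbitrary 50-item payload shaped like this.inputPools
pools = [{"name": f"input{i}", "feature": f"space{i}"} for i in range(50)]

raw = json.dumps(pools)
encoded = quote(raw)  # every quote, brace, and comma becomes a %XX escape

# The raw JSON alone is already past the 2,000-character limit,
# and the escaped form is considerably longer still
print(len(raw), len(encoded))
```

Fifty small dictionaries is not a big dataset, yet the URL is already over the most conservative length limit before percent-escaping even starts.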
In this scenario, simplicity and correctness should take precedence: switch to POST and send the data in the request body, which keeps it structured and out of URLs and access logs.
Final Solution
In the end, my frontend looks like this:
generateResults() {
  this.generatingResults = true;
  apiFetch({
    url: '/api/result_wizard',
    method: 'POST',
    data: this.inputPools,
  }).then(response => {
    // ...
  });
}
And the backend unpacks the input:
@require_http_methods(['POST'])
def result_wizard(request):
    data = json.loads(request.body)
    results = _result_wizard(data)
    return ResourceResponse(results, status=200)
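One practical caveat: request.body is not guaranteed to be valid JSON. Isolating the parsing step into a plain function makes it easy to guard and to test without a full request cycle; the helper name and error message here are my own, not part of the original view:

```python
import json

def parse_json_body(raw):
    """Parse a request body as JSON, returning (data, error_message)."""
    try:
        return json.loads(raw), None
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None, "Request body must be valid JSON"

data, error = parse_json_body(b'[{"name": "input1", "feature": "space1"}]')
print(error)  # None
```

The view would then return a 400 response whenever the error is set, instead of letting json.loads raise a 500.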
Level Up
Conveniently, during the development of this functionality, a request came in to include an option for downloading the data as an Excel file. This allowed me to reuse the majority of the code, and simply add a flag to indicate the intended purpose:
generateResults(action) {
  const postData = {
    pools: this.inputPools,
    action: action,
  };
  if (action === 'retrieve') {
    this.retrievingResults = true; // Loading indicator for retrieval
  } else {
    this.downloadingResults = true; // Loading indicator for downloading
  }
  apiFetch({
    url: '/api/result_wizard',
    method: 'POST',
    data: postData,
  }).then(response => {
    if (action === 'retrieve') {
      // Handle display
    } else {
      // Handle download
    }
  }).finally(() => {
    if (action === 'retrieve') {
      this.retrievingResults = false;
    } else {
      this.downloadingResults = false;
    }
  });
}
Backend:
@require_http_methods(['POST'])
def result_wizard(request):
    data = json.loads(request.body)
    action = data.pop('action', None)
    results = _result_wizard(data)
    if action == 'download':
        return generate_download_response(results)
    return ResourceResponse(results, status=200)
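The action branch can likewise be pulled out into a plain, testable function. This is a hedged sketch with a made-up name (dispatch_action), not the actual view code:

```python
def dispatch_action(data):
    """Pop the 'action' flag from a POST payload and pick a response path.

    The real view would map 'download' to the Excel response and anything
    else to the normal JSON response.
    """
    action = data.pop("action", None)
    pools = data.get("pools", [])
    return action or "retrieve", pools

action, pools = dispatch_action({"pools": [{"name": "input1"}], "action": "download"})
print(action)  # download
```

Defaulting a missing flag to 'retrieve' keeps older frontend callers that never send an action working unchanged.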
Final Thoughts
As programmers, we often strive to follow best practices, like using GET for retrieval. But sometimes, the complexity of a situation forces us to adapt. When datasets become large or intricate, POST provides the simplicity, flexibility, and security we need.
In the end, the time spent exploring and understanding the limitations of GET wasn’t wasted. It led to a deeper understanding of how to balance practicality with best practices. After all, the simplest solution is often the best.