LRU Cache Simulation Heap Overflow Problem

Hi, I'm new to C and are working on LRU Cache Simulation recently. However, when testing testcase w/ very large amount of data, it shows up that heap-buffer overflow. I've tried to use Valgrind to debug, but still can't locate where the bug is. The source code is shown below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

typedef struct Node Node;
typedef struct Node{
    Node *prev;
    Node *next;
    int key;
    int val;
} Node;

typedef struct{
    // only need pointers to head and tail for eviction pop
    // and recent use push respectively 
    Node *head;
    Node *tail;
    Node **hashtable;
    int capacity;
    int curr_size;
} LRUCache;

void deleteNode(Node *);

void addprevNode(Node *, Node *);

void printCache(LRUCache *);

LRUCache* lRUCacheCreate(int);

int lRUCacheGet(LRUCache*, int);

void lRUCachePut(LRUCache*, int, int);

void lRUCacheFree(LRUCache*);

/*
  Aware of special cases:
  1. empty cache
  2. one node cache
  3. two nodes cache
*/
void deleteNode(Node *target){
    assert(target != NULL);
    // includes cases for end nodes and non-end nodes
    if(target->prev != NULL){
        target->prev->next = target->next;
    }
    if(target->next != NULL){
        target->next->prev = target->prev;
    }
    // if target is the only node in the table, nothing happens
}

/*
  Aware of special cases:
  1. empty cache
  2. one node cache
  3. target is the original head node
  4. target = newNode!
*/
void addprevNode(Node *target, Node *newNode){
    assert(target  != NULL);
    assert(newNode != NULL);

    // case of target = newNode!
    if (target == newNode){
        return;
    }
    
    newNode->next = target;
    newNode->prev = target->prev;
    target->prev  = newNode;
    
    // if target is not the original head 
    // and newNode becomes the new head
    if(newNode->prev != NULL){
        newNode->prev->next = newNode;
    }
}

void printCache(LRUCache *obj){
    assert(obj != NULL);
    Node *current = obj->head;
    printf("---------------Cache head--------------- \n");
    while(current){
        printf("Key-value pair of current node is : %d-%d \n", current->key, current->val);
        current = current->next;
    }
    printf("---------------Cache tail--------------- \n");
    printf("\n");
}

LRUCache* lRUCacheCreate(int capacity){
    const int MAX_KEY_SIZE = 10000;
    int index;
    LRUCache *Cache = (LRUCache *)malloc(sizeof(LRUCache));
    assert(Cache != NULL);
    Cache->capacity  = capacity;
    Cache->hashtable = (Node **)malloc(sizeof(Node *) * MAX_KEY_SIZE);
    assert(Cache->hashtable != NULL);
    for(index = 0; index < MAX_KEY_SIZE; index++){
        Cache->hashtable[index] = NULL;
    }
    Cache->curr_size = 0;
    Cache->head      = NULL;
    Cache->tail      = NULL;
    
    return Cache;
}

/*
  Aware of special cases:
  1. empty cache
  2. one node cache
  3. modifying head and tail nodes
*/
int lRUCacheGet(LRUCache* obj, int key){
    // 1. compare the key with keys in keylist in interation
    // 2. if match, move that node to tail for recent use
    // So, need to use deletenode and addprevnode!
    assert(obj != NULL);
    assert(obj->capacity > 0);
    assert(obj->hashtable != NULL);
    
    int result = -1;
    // Requires O(1) so must uses hashtable
    Node *targetNode;
    // if key value pair exists in cache

    // may be improved by using curr_size when dealing removing head node and re-inserting it
    
    if(obj->hashtable[key] != NULL){
        targetNode = obj->hashtable[key];
        result     = targetNode->val;
        deleteNode(targetNode);
         
        // case of modifying head and tail nodes
        if(targetNode == obj->head && targetNode->next != NULL){
            obj->head = targetNode->next;
        }
        if(targetNode == obj->tail && targetNode->prev != NULL){
            obj->tail = targetNode->prev;
        }
        addprevNode(obj->head, targetNode);
        obj->head = targetNode;
    }
    printf("Value to key %d is : %d, and cache after fetching it is: \n", key, result);
    printCache(obj);
    return result;
}

/*
  Aware of special cases:
  1. empty cache
  2. one node cache
  3. modifying head and tail nodes
*/
void lRUCachePut(LRUCache* obj, int key, int value){
    // 1. compare the key with keys in keylist in interation
    // 2. if match, update the the value corresponds to the key
    // 3. if doesn't match, if current size exceeds capacity, evicts the tail         
    //    node and add new node at tail; if current size doesn't exceed, ad new       
    //    node at tail
    // So, need to use deletenode and addprevnode!
    assert(obj != NULL);
    assert(obj->capacity > 0);
    assert(obj->hashtable != NULL);
    
    // if the key-value pair already exists
    Node *targetNode;
    if(obj->hashtable[key] != NULL){
        targetNode      = obj->hashtable[key];
        targetNode->val = value;
        deleteNode(targetNode);
         
        // case of modifying head and tail nodes
        if(targetNode == obj->head && targetNode->next != NULL){
            obj->head = targetNode->next;
        }
        if(targetNode == obj->tail && targetNode->prev != NULL){
            obj->tail = targetNode->prev;
        }
        addprevNode(obj->head, targetNode);
        obj->head = targetNode;
        
        printf("Value of key %d is updated to : %d \n", key, value);
    }
    // if the key-value pair doesn't exist yet
    else{
        // if exceeds the capacity after adding the new key-value pair 
        if(obj->curr_size >= obj->capacity){
            Node *evictNode = obj->tail;
            obj->tail = evictNode->prev;
            // case of one node cache
            if(evictNode == obj->head){
                obj->head = evictNode->prev;
            }     
            deleteNode(evictNode);
            obj->hashtable[evictNode->key] = NULL;
            obj->curr_size--;
        }
        // if doesn't exceeds the capacity after adding the new key-value pair
        Node *newNode = (Node *)malloc(sizeof(Node));
        assert(newNode != NULL);
        newNode->prev = NULL;
        newNode->next = NULL;
        newNode->key  = key;
        newNode->val  = value;
        obj->hashtable[key] = newNode;
        obj->curr_size++;
        // for the initial condition of empty cache
        if(obj->head == NULL){
            obj->head = newNode;
            obj->tail = newNode;
        }
        else{
            addprevNode(obj->head, newNode);
            obj->head = newNode;
        }
    }
    printf("Key-value pair : %d-%d is inserted into the cache \n", key, value);
    printCache(obj);
    //printf("Key-value pair : %d-%d is evicted from the hashtable!\n", evictNode->key, evictNode->val);
}

void lRUCacheFree(LRUCache* obj){
    assert(obj != NULL);
    assert(obj->capacity > 0);
    assert(obj->hashtable != NULL);
    Node *currNode = obj->head;
    
    while(currNode){
        Node *tempNode = currNode;
        // Why is freeing here causing heap use-after-free?
        // Because currNode and tempNode points to the same segment in memory
        // free(tempNode); 
        printf("Key-value pair : %d-%d in the cache is freed! \n", tempNode->key, tempNode->val);
        currNode = currNode->next;
        free(tempNode);
    }
    free(obj->hashtable);
    free(obj);
    printf("Cache freed successfully! \n");
    return;
}

int main(){
    LRUCache* obj = lRUCacheCreate(2);
    int param_1;
    lRUCachePut(obj, 1, 1);
    lRUCachePut(obj, 2, 2);
    lRUCachePut(obj, 3, 3);
    param_1 = lRUCacheGet(obj, 2);
    lRUCacheFree(obj);
    return 0;    
}


And if there's anything that I can improve, maybe the coding style?
Any help will be appreciated! Thx:)
Last edited on
drive by looking...

yes, its because they are the same, and also the print if you had that when it went wrong.
1
2
3
4
5
6
7
8
9
10
11
12
    while(currNode){
        Node *tempNode = currNode;
        // Why is freeing here causing heap use-after-free?
        // Because currNode and tempNode points to the same segment in memory
        // free(tempNode); 


        printf("Key-value pair : %d-%d in the cache is freed! \n", tempNode->key, tempNode->val);


        currNode = currNode->next;
        free(tempNode);


its running ok as-is. what breaks it?
Last edited on
The problem is that when using small test case, like the one in main, compiler doesn't show any error. However, when using Valgrind to check memory usage, it shows the below error message:


==87== HEAP SUMMARY:
==87==     in use at exit: 24 bytes in 1 blocks
==87==   total heap usage: 6 allocs, 5 frees, 81,128 bytes allocated
==87==
==87== 24 bytes in 1 blocks are definitely lost in loss record 1 of 1
==87==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==87==    by 0x109829: lRUCachePut (LRU_Cache.c:220)
==87==    by 0x109A55: main (LRU_Cache.c:293)
==87==
==87== LEAK SUMMARY:
==87==    definitely lost: 24 bytes in 1 blocks
==87==    indirectly lost: 0 bytes in 0 blocks
==87==      possibly lost: 0 bytes in 0 blocks
==87==    still reachable: 0 bytes in 0 blocks
==87==         suppressed: 0 bytes in 0 blocks
==87==
==87== For lists of detected and suppressed errors, rerun with: -s
==87== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)


Why is there 6 allocs in total? Isn't there 2 malloc() in lRUCacheCreate(), and one for each lRUCachePut(), so a total of 5 allocs? And since the message shown in lRUCacheFree() implies all nodes in cache are freed, why is there lost memory?

What's more is that when testing the code with larger test case, say cache capacity = 100000, and much more cache instructions, compiler shows heap-buffer-overflow.
Last edited on
Ok, looking.
in the mean time, consider that NULL is zero, and take a look at calloc. This would save the loop over pointers setting them all NULL.
there are 5 allocations, even with larger numbers, which do not crash my machine.

update: adding a loop to insert many items is crashing it.

214ish
this looks odd, but I think its ok... ? its misleading to a casual reader, but its fine.

deleteNode(evictNode);
obj->hashtable[evictNode->key] = NULL;
obj->curr_size--;
Last edited on
the problem is [key] is not being checked against the size of the hashtable in lRUCachePut on several lines. If you try to insert with a key that is too large, it will crash.
check every instance of [key] in the code; its crashing on your test (large # of inserts) due to the cacheput but there are others in the code, eg 149

also, take anything valgrind says with a grain of salt. Its a powerful tool, but its JOB is to REPORT POTENTIAL problems. It can and will report problems that do not exist right beside real issues.

I personally find trying to read code with misaligned braces aggravating. Its a valid style, but its difficult to follow.
consider
foo()
{ //aligned with
...
} //this one

vs
foo(){ //not aligned
...
}//with anything.

Honestly, though, if you are new to C you are well on your way. Its pretty solid looking code, and the bug was subtle, tied to const int MAX_KEY_SIZE = 10000 in a less than obvious way.

the only complaint my compiler had on max grumpy setting was that you have an unused variable.
Last edited on
And if there's anything that I can improve, maybe the coding style?
Any help will be appreciated! Thx:)

No range check on hashtable index.

const is your friend, use it.
- e.g. void printCache(const LRUCache* obj)

node creation logic split across two functions: addPrevNode shouldn't exist, there should be a single createNode function, and a proper deleteNode function that act as a pair.

Although you call deleteNode when you evict a node from the cache, you never actually delete it, that's your leak. That would be obvious if there was some symmetry between your create<Stuff> and delete<Stuff> in your code.
Topic archived. No new replies allowed.