Last month, several interfaces in the production environment returned abnormal responses. The production logs showed the following error: Redis connection reset.
The production Redis client is Lettuce, the Spring Boot default, and no connection pool is configured. The connection reset by peer error occurs when the server has already terminated the current Redis connection but the client is unaware of it. When the next request comes in, Lettuce keeps using that stale connection to request data, and the call fails with connection reset by peer.
Normally, the server sends a FIN packet to notify the client when it closes a connection. However, when I monitored the server's TCP traffic with tcpdump, I found that the Redis server's TCP connection received an RST packet, apparently from the client, after a period of inactivity such as 10 minutes. Meanwhile, a Wireshark capture running on the client showed that the client had never sent an RST packet to the server. This mismatch suggests that the server side restricts TCP connections and forcibly drops those that stay idle for too long. With that, I had reproduced the occasional connection reset by peer error seen on the production Redis connection.
Now that it’s clear the error is caused by Redis connections being interrupted after prolonged inactivity, how do we resolve this bug?
My first thought was to solve it with retries, but it turned out not to be that simple. Here's the code:
// Query Redis
public T getCacheObject(final String key) {
    try {
        ValueOperations<String, T> operation = redisTemplate.opsForValue();
        return operation.get(key);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        return retryGetCacheObject(key, 3);
    }
}

// Retry querying Redis
public T retryGetCacheObject(final String key, int retryCount) {
    try {
        log.info("retryGetCacheObject, key:{}, retryCount:{}", key, retryCount);
        if (retryCount <= 0) {
            return null;
        }
        Thread.sleep(200L);
        retryCount--;
        ValueOperations<String, T> operation = redisTemplate.opsForValue();
        return operation.get(key);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        return retryGetCacheObject(key, retryCount);
    }
}
In the code above, if the first Redis query throws an exception, the query is retried every 200 milliseconds, up to 3 times. In practice, all three retries failed with the same connection reset by peer error, because no new Redis connection was ever acquired.
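To make the failure mode concrete, here is a minimal simulation of why retrying alone cannot help when every attempt is served by the same cached connection. FakeConnection and FakeConnectionFactory are hypothetical stand-ins written for this sketch, not Lettuce or Spring classes; the only assumption is that the factory keeps handing out one shared connection until it is explicitly reset.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedConnectionDemo {

    /** Hypothetical stand-in for a Redis connection; not a Lettuce class. */
    static class FakeConnection {
        final int id;
        boolean resetByPeer = false; // set when the "server" drops the connection

        FakeConnection(int id) {
            this.id = id;
        }

        String get(String key) {
            if (resetByPeer) {
                throw new IllegalStateException("connection reset by peer (conn " + id + ")");
            }
            return "value-of-" + key;
        }
    }

    /** Hypothetical factory that caches a single shared connection. */
    static class FakeConnectionFactory {
        private final AtomicInteger ids = new AtomicInteger();
        private FakeConnection shared;

        synchronized FakeConnection getConnection() {
            if (shared == null) {
                shared = new FakeConnection(ids.incrementAndGet());
            }
            return shared;
        }

        synchronized void resetConnection() {
            shared = null; // the next getConnection() call opens a fresh connection
        }
    }

    /** Counts how many of the given attempts fail when nothing replaces the connection. */
    static int failuresWithoutReset(FakeConnectionFactory factory, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                factory.getConnection().get("k");
            } catch (IllegalStateException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        FakeConnectionFactory factory = new FakeConnectionFactory();
        factory.getConnection().resetByPeer = true; // simulate the idle drop

        System.out.println("failures without reset: " + failuresWithoutReset(factory, 3));

        factory.resetConnection();
        System.out.println("after reset: " + factory.getConnection().get("k"));
    }
}
```

In this sketch all three plain retries fail, and the first call after resetConnection() succeeds because it runs on a newly created connection.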
At this point, the real problem is how to create a new connection to replace the one that failed.
Let’s cut to the chase with the code:
// Lettuce connection factory
@Autowired
private LettuceConnectionFactory lettuceConnectionFactory;

/**
 * Retrieve the basic cache object.
 *
 * @param key Cache key
 * @return The data corresponding to the cache key
 */
public T getCacheObject(final String key) {
    try {
        ValueOperations<String, T> operation = redisTemplate.opsForValue();
        return operation.get(key);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        return retryGetCacheObject(key, 1);
    }
}

public T retryGetCacheObject(final String key, int retryCount) {
    try {
        log.info("retryGetCacheObject, key:{}, retryCount:{}", key, retryCount);
        if (retryCount <= 0) {
            return null;
        }
        // Drop the broken connection so the next access creates a new one
        lettuceConnectionFactory.resetConnection();
        Thread.sleep(200L);
        retryCount--;
        ValueOperations<String, T> operation = redisTemplate.opsForValue();
        return operation.get(key);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        return retryGetCacheObject(key, retryCount);
    }
}
When fetching data over the current Redis connection fails after the timeout interval has been exceeded, the exception sends execution into the retry method, which calls lettuceConnectionFactory.resetConnection() to reset the connection. A new connection is then created to fetch the data, so the client gets a normal response. The lettuceConnectionFactory object is the non-pooled Lettuce connection factory; it provides methods to obtain, initialize, and reset connections: lettuceConnectionFactory.getConnection(), lettuceConnectionFactory.initConnection(), and lettuceConnectionFactory.resetConnection(). Setting timeout in the Spring Boot configuration caps data retrieval at 2 seconds, which keeps the interface request time to around 2 seconds:
redis:
  xx: xx
  timeout: 2000
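The reset-then-retry flow described above can also be written as a loop instead of recursion. The following is only a sketch under stated assumptions: RetryWithReset and its call method are hypothetical names introduced here, and the operation and reset hook are passed in as plain Java functional interfaces so the example runs without Spring or Redis.

```java
import java.util.concurrent.Callable;

public class RetryWithReset {

    /**
     * Runs the operation; on failure, calls reset, waits backoffMillis, and tries again,
     * for at most maxRetries extra attempts. Returns null if every attempt fails,
     * mirroring the null return of the retry method in the article.
     */
    static <T> T call(Callable<T> operation, Runnable reset, int maxRetries, long backoffMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (attempt == maxRetries) {
                    break; // out of retries
                }
                reset.run(); // drop the broken connection before the next attempt
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    break;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        boolean[] broken = {true}; // simulated stale connection
        Callable<String> query = () -> {
            if (broken[0]) {
                throw new IllegalStateException("connection reset by peer");
            }
            return "cached-value";
        };

        // One retry, preceded by a reset that "replaces" the connection.
        String result = call(query, () -> broken[0] = false, 1, 10L);
        System.out.println(result); // prints "cached-value"
    }
}
```

With the article's objects, the call might look like RetryWithReset.call(() -> redisTemplate.opsForValue().get(key), lettuceConnectionFactory::resetConnection, 1, 200L), although that wiring is not part of the original code.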
With this, the occasional disconnection of the non-pooled Lettuce client connection in the production Spring Boot project is considered solved.
Finally, here is the project where this is used: newbeemall, a MyBatis-Plus version of the newbee-mall platform that implements coupon collection, Alipay sandbox payment, added back-end search, and RedisSearch word segmentation retrieval.