[Zookeeper] Server error causing client connection issue #2129

Open
mreyescdl opened this issue Dec 5, 2024 · 6 comments
@mreyescdl
Contributor

On 12/03 at 02:05:30, uc3-mrtzk-prd05, which was the leader, encountered a write error. This triggered errors on all peers and eventually caused Merritt Ingest client-side errors.

Here is the stack trace on worker 05:

2024-12-03 02:05:38,365 [myid:] - ERROR [Sender-/172.30.5.246:36818:o.a.z.s.q.LearnerHandler@372] - Exception while sending packets in LearnerHandler
java.net.SocketException: Broken pipe (Write failed)
        at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110)
        at java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:150)
        at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
        at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
        at org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:334)
        at org.apache.zookeeper.server.quorum.LearnerHandler.access$200(LearnerHandler.java:65)
        at org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:751)
  • Not sure if we need to tune Zookeeper better. Should we explicitly define the Zookeeper JVM settings, or rely on Java to set them correctly?

Let's monitor for now.

@mreyescdl mreyescdl self-assigned this Dec 5, 2024
@mreyescdl
Contributor Author

mreyescdl commented Dec 11, 2024

Encountered another Zookeeper server error, which caused submission errors.
This log snippet is from Zookeeper worker 03:

2024-12-10 15:38:01,949 [myid:] - WARN  [SyncThread:3:o.a.z.s.q.SendAckRequestProcessor@65] - Closing connection to leader, exception during packet send
java.net.SocketException: Broken pipe (Write failed)
        at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110)
        at java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:150)
        at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
        at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
        at org.apache.zookeeper.server.quorum.Learner.writePacketNow(Learner.java:206)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:195)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:63)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:248)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:169)

@mreyescdl
Contributor Author

Another occurrence at 2024-12-15 17:36.

This time it happened during an eScholarship submission, which carries very large Zookeeper payloads (up to 50K) as well as Unicode data (possibly corrupt).

Submitted 50 eScholarship objects to Stage, but could not reproduce the problem.
Continuing to look for a client-side cause of the errors.

Here are stack traces from a) the client, b) a ZK worker, and c) the ZK leader:

a)

BatchReportConsumerDaemon: Checking for additional tasks -  Current tasks: 0 - Max: 5
17:36:02.322 [http-nio-33121-exec-3-SendThread(uc3-mrtzk-prd03.cdlib.org:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x0 for server uc3-mrtzk-prd03.cdlib.org/172.30.42.133:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x0, likely server has closed socket
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)

b)

2024-12-15 17:36:02,004 [myid:] - WARN  [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):o.a.z.s.q.Follower@128] - Exception when following the leader
java.net.SocketTimeoutException: Read timed out
        at java.base/java.net.SocketInputStream.socketRead0(Native Method)
        at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:271)
        at java.base/java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:86)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:134)
        at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:228)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:124)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1539)

c)

2024-12-15 17:36:03,139 [myid:] - ERROR [LearnerHandler-/172.30.13.172:57590:o.a.z.s.q.LearnerHandler@719] - Unexpected exception in LearnerHandler:
java.io.EOFException: null
        at java.base/java.io.DataInputStream.readInt(DataInputStream.java:397)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:86)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:134)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:656)
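
To see the order of events, the three timestamps above can be lined up on one timeline. This is a minimal sketch, not a diagnosis: it assumes the client and server clocks are in sync and that the client log line (which carries no date) is from 2024-12-15. The function names are illustrative.

```python
from datetime import datetime

# (source, timestamp, event) copied from traces a), b) and c) above.
# The date on the client entry is an assumption; its log line has only a time.
EVENTS = [
    ("client", "2024-12-15 17:36:02.322", "EndOfStreamException"),
    ("worker 03 (follower)", "2024-12-15 17:36:02,004", "SocketTimeoutException: Read timed out"),
    ("leader", "2024-12-15 17:36:03,139", "EOFException in LearnerHandler"),
]

def parse_ts(text: str) -> datetime:
    """Parse a timestamp that uses either ',' or '.' before the milliseconds."""
    return datetime.strptime(text.replace(",", "."), "%Y-%m-%d %H:%M:%S.%f")

def timeline(events):
    """Return (seconds since first event, source, event), sorted by time."""
    parsed = sorted((parse_ts(ts), source, what) for source, ts, what in events)
    start = parsed[0][0]
    return [
        (round((ts - start).total_seconds(), 3), source, what)
        for ts, source, what in parsed
    ]

for offset, source, what in timeline(EVENTS):
    print(f"+{offset:.3f}s  {source}: {what}")
```

Under those assumptions, the follower's read timeout comes first, the client disconnect follows about 0.3 seconds later, and the leader's EOF about 1.1 seconds after the timeout.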

@mreyescdl
Contributor Author

Another Zookeeper error during an eScholarship submission.
The client-side error log indicates that the server closed the connection.

14:54:09.367 [pool-8-thread-2-SendThread(uc3-mrtzk-prd03.cdlib.org:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x0 for server uc3-mrtzk-prd03.cdlib.org/172.30.42.133:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x0, likely server has closed socket
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)

@mreyescdl
Contributor Author

mreyescdl commented Dec 18, 2024

According to the following documentation from the Zookeeper admin page, Zookeeper is designed to handle data payloads on the order of kilobytes. If payload size were the issue, we would not expect to be receiving socket errors.

jute.maxbuffer : (Java system property:jute.maxbuffer).
- This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The unit is: byte. The default is 0xfffff(1048575) bytes, or just under 1M.
- This is really a sanity check. ZooKeeper is designed to store data on the order of kilobytes in size. In the production environment, increasing this property to exceed the default value is not recommended for the following reasons:
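
As a rough scale check, the sketch below compares a payload's encoded size against the documented default. Only the 0xfffff value comes from the documentation quoted above; the helper names and the sample payload are hypothetical. It assumes payloads are UTF-8 encoded, so non-ASCII text is counted in bytes rather than characters, and it ignores any request overhead beyond the data itself.

```python
# Default jute.maxbuffer from the Zookeeper admin docs quoted above.
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF  # 1048575 bytes

def payload_size_bytes(payload: str) -> int:
    """Size of the payload in bytes, assuming it is stored as UTF-8."""
    return len(payload.encode("utf-8"))

def fits_default_jute_buffer(payload: str) -> bool:
    """Hypothetical helper: True if the payload is under the default limit."""
    return payload_size_bytes(payload) < DEFAULT_JUTE_MAXBUFFER

# Illustrative payload only: 50,000 characters that each take 2 bytes in UTF-8.
sample = "é" * 50_000
print(payload_size_bytes(sample))        # 100000
print(fits_default_jute_buffer(sample))  # True
```

Even with every character taking two bytes, a 50K-character payload is about a tenth of the default limit.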

@mreyescdl
Contributor Author

mreyescdl commented Dec 18, 2024

The Zookeeper error noted above stems from Worker 4 being unable to read data from Worker 3.
Let's look at where our 5 Zookeeper workers are located:

uc3-mrtzk-prd01: us-west-2a
uc3-mrtzk-prd02: us-west-2b
uc3-mrtzk-prd03: us-west-2c
uc3-mrtzk-prd04: us-west-2a
uc3-mrtzk-prd05: us-west-2b

The workers are distributed across 3 AZs to increase reliability.
However, when under load (from eScholarship), we are writing large amounts of data from the leader to the workers.
If we have 500 objects, each with a payload of 2K, we have 1MB of data being written across AZs.
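As noted above, jute.maxbuffer defaults to just under 1MB, though per the documentation that limit applies to the data in a single znode rather than to total traffic.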
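
The numbers in this comment can be worked through in a short sketch. The object count, payload size, and AZ placement are the figures given here; treating prd05 as the leader is an assumption carried over from the 12/03 incident, and the function name is illustrative.

```python
from collections import Counter

DEFAULT_JUTE_MAXBUFFER = 0xFFFFF  # 1048575 bytes, per the admin docs

# AZ placement as listed in this comment.
AZ_BY_WORKER = {
    "uc3-mrtzk-prd01": "us-west-2a",
    "uc3-mrtzk-prd02": "us-west-2b",
    "uc3-mrtzk-prd03": "us-west-2c",
    "uc3-mrtzk-prd04": "us-west-2a",
    "uc3-mrtzk-prd05": "us-west-2b",
}

def cross_az_followers(leader: str) -> list:
    """Followers that sit in a different AZ than the given leader."""
    leader_az = AZ_BY_WORKER[leader]
    return sorted(w for w, az in AZ_BY_WORKER.items()
                  if w != leader and az != leader_az)

objects = 500             # objects in one batch
payload_bytes = 2 * 1024  # 2K per object
total_bytes = objects * payload_bytes

print(total_bytes)                            # 1024000
print(total_bytes / DEFAULT_JUTE_MAXBUFFER)   # roughly 0.98
print(Counter(AZ_BY_WORKER.values()))
# Assumption: prd05 is the leader, as it was on 12/03.
print(cross_az_followers("uc3-mrtzk-prd05"))
```

With prd05 as leader, three of the four followers (prd01, prd03, prd04) are in a different AZ, so most replication traffic crosses an AZ boundary.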

Would it be easier if all Zookeeper nodes were in the same AZ? Does this pose a realistic risk?
Let's discuss with @ashleygould

@mreyescdl
Contributor Author

A logger running on Zookeeper worker 05 snapshots "srvr" info from all 5 workers every 15 minutes.
It will run over the holidays and should not fill up the application disk space: I estimate it will consume at most 50M, and the capacity is 38G.

This info will show:

  • Number of connections for each worker
  • Latency
  • Proposal sizing
  • Received/Sent requests

Reminder: Kill process after holiday break. It is located at:
uc3-mrtzk-prd05:/dpr2/tools/srvr
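
The logger itself is not shown in this issue. As a rough idea of what one snapshot could look like, here is a minimal sketch that sends the "srvr" four-letter command to each worker and parses the reply. The hostnames other than prd03 are assumed to follow the pattern seen in the client logs, port 2181 is taken from those logs, and all function names are hypothetical.

```python
import socket

# Assumption: every worker follows the uc3-mrtzk-prd0N.cdlib.org pattern
# that appears for prd03 in the client logs.
WORKERS = [f"uc3-mrtzk-prd0{n}.cdlib.org" for n in range(1, 6)]

def four_letter_word(host: str, command: str = "srvr",
                     port: int = 2181, timeout: float = 5.0) -> str:
    """Send a four-letter command and return the server's full reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_srvr(text: str) -> dict:
    """Turn 'Key: value' lines into a dict, skipping lines without a key."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

def snapshot_all(workers=WORKERS) -> dict:
    """Collect one parsed 'srvr' snapshot per worker; errors are recorded."""
    results = {}
    for host in workers:
        try:
            results[host] = parse_srvr(four_letter_word(host))
        except OSError as exc:
            results[host] = {"error": str(exc)}
    return results
```

Calling `snapshot_all()` from a scheduler every 15 minutes and appending the result to a file would approximate the behavior described above.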
