Tuesday, November 17, 2020

Weblogic - RAC node failure tests - Data Source / Attempt to set connection harvestable to false and/or Attempt to operate on Connection that is already closed.

We recently implemented a WebLogic cluster and a multi-node Oracle RAC database in a mission-critical project.

During the RAC node failure tests, we saw that the managed servers were getting stuck. Moreover, when we tested the data sources on them, we got errors.

All the URLs were configured properly. The data sources were configured to use SCAN and service names to connect to the Oracle RAC database. However, the managed servers still couldn't cope with a RAC node failure.

When we checked the logs, we saw two crucial things ->

java.io.IOException: Attempt to set connection harvestable to false but the connection is already closed.

java.sql.SQLException: Attempted operation on Connection that is already closed.

These exceptions seemed to blink at us.

It was obvious that WebLogic was harvesting the connections to ensure that a specified number of connections were always available in the pool. WebLogic does this to improve performance by minimizing connection initialization, but it was still holding the connections to the failed node, and the managed servers and our applications were getting errors while trying to use them.

There was no ONS or GridLink configuration, so we needed to tell WebLogic and the regular data source not to use those connections. The quickest way to do that was to set the "Test Connections On Reserve" and "Test Frequency" parameters.

Information about these parameters:

Test Connections on Reserve : Enables WebLogic Server to test a connection before giving it to a client. (Requires that you specify a Test Table Name.)
The test adds a small delay in serving the client's request for a connection from the pool, but ensures that the client receives a viable connection.
This test is required for connection pools used in a multi data source that use the failover algorithm.

Test Frequency : The number of seconds between when WebLogic Server tests unused connections. (Requires that you specify a Test Table Name.) Connections that fail the test are closed and reopened to re-establish a valid physical connection. If the test fails again, the connection is closed.
When set to 0, periodic testing is disabled.
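
To make the effect of these two parameters more concrete, here is a minimal sketch in plain Java. It is not WebLogic's implementation; PooledConn, SimulatedConn and ReserveTestingPool are hypothetical names used only for illustration, and the retry-once behaviour of the real periodic test is simplified to "close and replace". The idea is simply that a connection is probed before it is handed out (test on reserve), and that idle connections are probed from time to time (test frequency).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical stand-in for a pooled physical connection.
interface PooledConn {
    boolean isAlive(); // stands in for running the test query
    void close();
}

// Simulated connection; alive=false mimics a connection to the failed RAC node.
class SimulatedConn implements PooledConn {
    private final boolean alive;
    private boolean closed;

    SimulatedConn(boolean alive) { this.alive = alive; }

    public boolean isAlive() { return alive && !closed; }
    public void close() { closed = true; }
    boolean isClosed() { return closed; }
}

// Illustration only: not WebLogic code.
class ReserveTestingPool {
    private final Deque<PooledConn> idle = new ArrayDeque<>();
    private final Supplier<PooledConn> factory;

    ReserveTestingPool(Supplier<PooledConn> factory) { this.factory = factory; }

    void addIdle(PooledConn c) { idle.addLast(c); }

    int idleCount() { return idle.size(); }

    // "Test on reserve": probe each idle connection before returning it.
    PooledConn reserve() {
        while (!idle.isEmpty()) {
            PooledConn c = idle.pollFirst();
            if (c.isAlive()) {
                return c;
            }
            c.close(); // stale connection, discard it
        }
        return factory.get(); // nothing usable left, open a fresh connection
    }

    // "Test frequency": one periodic pass over the idle connections.
    // Returns how many dead connections were closed and replaced.
    int testIdleConnections() {
        int replaced = 0;
        int n = idle.size();
        for (int i = 0; i < n; i++) {
            PooledConn c = idle.pollFirst();
            if (c.isAlive()) {
                idle.addLast(c);
            } else {
                c.close();
                idle.addLast(factory.get());
                replaced++;
            }
        }
        return replaced;
    }
}
```

With a pool holding one stale and one healthy connection, reserve() closes the stale one and returns the healthy one, so the application never sees the "already closed" error. The cost is one extra probe per reservation, which is the small delay mentioned in the parameter description.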

These properties are configurable through the WebLogic console.

They are located under "Domain Structure > Services > Data Sources > Data Source name > Configuration > Connection Pool > Advanced Options".

As for the solution, we set "Test Connections On Reserve" to true and "Test Frequency" to a value (in seconds) tuned for our environment.

Also, we chose to use "SQL SELECT 1 FROM DUAL" as the test table name, because of the following restrictions of "SQL ISVALID" ->

[ISVALID] can improve performance; however, it cannot detect the problem in a few cases: when there is a problem in the parser, execution, or another function (other than the connection itself) on the database side, or when the database is in shutdown mode but has not yet disconnected.
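
The difference between the two test styles can be sketched with plain JDBC. This is only an illustration, assuming that "SQL ISVALID" corresponds to the standard JDBC 4 Connection.isValid() call; the class ConnectionProbe and its method names are hypothetical and are not part of WebLogic.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, for illustration only.
class ConnectionProbe {

    // Lightweight check: asks the driver whether the connection is still valid.
    // It may not exercise statement parsing and execution on the database side.
    static boolean probeWithIsValid(Connection conn, int timeoutSeconds) {
        try {
            return conn.isValid(timeoutSeconds);
        } catch (SQLException e) {
            return false;
        }
    }

    // Full round trip: the database has to parse and execute a real statement,
    // which is what a test table name of "SQL SELECT 1 FROM DUAL" asks for.
    static boolean probeWithQuery(Connection conn, int timeoutSeconds) {
        try (Statement st = conn.createStatement()) {
            st.setQueryTimeout(timeoutSeconds);
            try (ResultSet rs = st.executeQuery("SELECT 1 FROM DUAL")) {
                return rs.next();
            }
        } catch (SQLException e) {
            // e.g. "Attempted operation on Connection that is already closed."
            return false;
        }
    }
}
```

The query-based probe costs a little more per test, but it fails in the situations listed above where a driver-level validity check might still report the connection as usable.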

Well, this is about failover, but what about WLS - DB load balancing? Load balancing of database connections is really handled by the RAC DB setup :)

For your information ! :)

One more thing -> this was the quick win in this special case. Of course, the recommended configuration is the following:

-Configure ONS in RAC.
-Convert the generic data sources to GridLink data sources.
-Enable FAN and set Test Connections On Reserve to true in WebLogic.
-Select the "Remove Infected Connections Enabled" check box and restart WLS.
