Sqoop - Oracle JDBC connection reset issue

JDBC Client Connection Reset :


Environment: JDK1.6/CDH5.3/CM5.2/ojdbc6.jar
Issue : when Trade_IMPL table sourcing happens in parallel for multiple months, first few connections established  and rest all fails with the exception below.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/01/15 11:36:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.0
15/01/15 11:36:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/15 11:36:22 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
15/01/15 11:36:22 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
15/01/15 11:36:22 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
15/01/15 11:36:22 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
15/01/15 11:36:22 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
15/01/15 11:36:22 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
15/01/15 11:36:22 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
15/01/15 11:36:22 WARN tool.BaseSqoopTool: case that you will detect any issues.
15/01/15 11:36:22 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
15/01/15 11:36:22 INFO manager.SqlManager: Using default fetchSize of 1000
15/01/15 11:36:22 INFO tool.CodeGenTool: Beginning code generation
15/01/15 11:39:00 ERROR manager.SqlManager: Error executing statement: java.sql.SQLRecoverableException: Io exception: Connection reset
java.sql.SQLRecoverableException: Io exception: Connection reset
        at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:101)
        at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:133)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:199)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:263)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:521)
        at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:418)
        at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:508)
        at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:203)
        at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:33)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:510)
        at java.sql.DriverManager.getConnection(DriverManager.java:582)


Root Cause:

            11g JDBC driver tries to initialize the java.security.SecureRandom class to generate random numbers, possibly to be used in the client server handshake for initial session setup. On Linux, the call to generate a seed for SecureRandom can block if /dev/random does not have sufficient entropy available. And after a certain interval, the server resets the TCP connection as it sees no activity from the client. This is what the exception is about.



Sqoop seems to use the JDBC driver in two different ways:
  • First, it connects to the Oracle database directly. It does that to gather more information about the tables (or query) from where the data is extracted and generate the map reduce job that it will run.
  • Second, the map reduce job generated by Sqoop uses the JDBC driver to connect to the database and perform the actual data import.
Workaround:

1.     Add the highlighted option in GREEN to the sqoop import statement
sqoop import -D mapred.child.java.opts="\-Djava.security.egd=file:/dev/../dev/urandom"

 This works because reads from /dev/urandom do not block even in the absence of entropy and simply continue to return (pseudo-)random bytes of lower quality.

Enabling this parameter helps in reducing the time it takes to establish a JDBC connection to oracle 11g , a quick test below explains with and without this parameter

[ssethur7@Datanode3 ~]$ time java -cp /app/CTS/CML/cmldev/lib/ojdbc6.jar:.: Conn

27306

real    0m16.547s
user    0m0.889s
sys     0m0.042s
[ssethur7@Datanode3 ~]$

[ssethur7@Datanode3 ~]$ time java -Djava.security.egd=file:/dev/../dev/urandom -cp /app/CTS/CML/cmldev/lib/ojdbc6.jar:.: Conn
27306

real    0m1.017s
user    0m0.881s
sys     0m0.036s

2.    Set the environment variable for the users bash profile (~/.bash_profile)
export HADOOP_OPTS=-Djava.security.egd=file:/dev/../dev/urandom

3.    Set this value in the global level for java ($JAVA_HOME/jre/lib/security/java.security)
-Djava.security.egd=file:/dev/../dev/urandom




Comments

Popular Posts