Error: Cluster is being concurrently upgraded from 4.13.x to 5.0.x. Please retry establishing connection.

John West
We are in the process of upgrading two of our CDH clusters from CDH 5.12 (HBase 1.2.0) to CDH 6.3 (HBase 2.1.0).
The first cluster upgraded without a hiccup, and Phoenix 5.0.x worked with HBase without any issue.

But on the second cluster, which is slightly larger than the first, we keep getting the following error, which I think is related to https://issues.apache.org/jira/browse/PHOENIX-4653:

 Error: Cluster is being concurrently upgraded from 4.13.x to 5.0.x. Please retry establishing connection. (state=INT12,code=2010)
org.apache.phoenix.exception.UpgradeInProgressException: Cluster is being concurrently upgraded from 4.13.x to 5.0.x. Please retry establishing connection.
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.acquireUpgradeMutex(ConnectionQueryServicesImpl.java:3500)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.upgradeSystemTables(ConnectionQueryServicesImpl.java:3083)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2625)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2532)
        at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2532)
        at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
        at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
        at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
        at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
        at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
        at sqlline.Commands.close(Commands.java:906)
        at sqlline.Commands.closeall(Commands.java:880)
        at sqlline.SqlLine.begin(SqlLine.java:714)
        at sqlline.SqlLine.start(SqlLine.java:398)
        at sqlline.SqlLine.main(SqlLine.java:291)
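
To be clear about what we have already tried: we retry the connection repeatedly and the error never goes away. In client code the retry amounts to something like the sketch below (hypothetical, not our production code; the ZooKeeper quorum in the JDBC URL is a placeholder, and 2010 is the error code shown in the message above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class PhoenixRetryConnect {

    // 2010 is the code attached to the UpgradeInProgressException above.
    private static final int UPGRADE_IN_PROGRESS_CODE = 2010;

    static Connection connectWithRetry(String url, int maxAttempts)
            throws SQLException, InterruptedException {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return DriverManager.getConnection(url);
            } catch (SQLException e) {
                if (e.getErrorCode() != UPGRADE_IN_PROGRESS_CODE) {
                    throw e;               // a different failure, give up immediately
                }
                last = e;
                Thread.sleep(30_000L);     // wait, then let the next attempt re-check the upgrade mutex
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // "zk1,zk2,zk3" is a placeholder for the real ZooKeeper quorum.
        try (Connection conn = connectWithRetry("jdbc:phoenix:zk1,zk2,zk3:2181:/hbase", 10)) {
            System.out.println("Connected to " + conn.getMetaData().getDatabaseProductVersion());
        }
    }
}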

In the HBase shell, running list, I noticed there is a SYSTEM.MUTEX table, with the following content:
hbase(main):002:0> scan 'SYSTEM.MUTEX'
ROW                                                                              COLUMN+CELL
 \x00SYSTEM\x00CATALOG                                                           column=0:UPGRADE_MUTEX, timestamp=1569387929428, value=UPGRADE_MUTEX_LOCKED
1 row(s)
Took 0.1329 seconds

Is it safe to just disable and drop the SYSTEM.MUTEX table from HBase, and then try rerunning Phoenix sqlline.py again?
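
Alternatively, rather than dropping the whole table, would it be enough to delete just that stale lock row? Something like the sketch below is what I have in mind (untested, plain HBase client API; the only assumption is that the row key is exactly the one printed by the scan above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ClearStaleUpgradeMutex {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // picks up hbase-site.xml from the classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table mutex = conn.getTable(TableName.valueOf("SYSTEM.MUTEX"))) {
            // Row key exactly as printed by the scan: \x00SYSTEM\x00CATALOG
            byte[] row = Bytes.toBytesBinary("\\x00SYSTEM\\x00CATALOG");
            mutex.delete(new Delete(row));                   // removes the UPGRADE_MUTEX_LOCKED cell
        }
    }
}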

Thanks

Re: Error: Cluster is being concurrently upgraded from 4.13.x to 5.0.x. Please retry establishing connection.

John West
I should add that we had one corrupt old HBase table during the upgrade on the second cluster, which prevented HBase from starting correctly.
That corrupt table has since been dropped completely.


Earlier, the region server logs showed:
2019-09-24 16:24:36,469 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=old_metrics,8701203$lamp,1400225340230.551495036e5680480ed00c2045438676.
java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://cluster/hbase/data/default/old_metrics/551495036e5680480ed00c2045438676/d/b5ef4e30ce6c4bd09fa6279a4edc68ec
        at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1081)
        at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:942)
        at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:898)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7241)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7200)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7172)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7130)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7081)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
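
A quick check along the lines of the sketch below (HBase Admin API, assuming the standard client configuration on the classpath) is enough to confirm the table is really gone:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CheckOldMetricsDropped {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableName oldMetrics = TableName.valueOf("old_metrics");
            // Should print "false" once the table and its regions are fully removed.
            System.out.println("old_metrics still exists: " + admin.tableExists(oldMetrics));
        }
    }
}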

