The following notes are a brief record; detailed hands-on content is still to be added.
Choosing a version
hbase-0.98.6-cdh5.3.0
HBase configuration
http://hbase.apache.org/book.html#quickstart
Configure hbase-env.sh
Configure hbase-site.xml
Configure regionservers
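The notes do not record the contents of the three files, so the following is only a minimal sketch of what they might hold for this three-node cluster. The property names are standard HBase settings, but the values are assumptions: the NameNode port 8020, the use of all three hosts as the ZooKeeper quorum, and HBASE_MANAGES_ZK=false (chosen because ZooKeeper is started separately later in these notes). JAVA_HOME usually also needs to be set in hbase-env.sh and is left out here. The files are written to a scratch directory unless CONF_DIR is set.

```shell
# Sketch only: write minimal HBase config files for the 3-node cluster.
# CONF_DIR defaults to a scratch directory so no real config is touched.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
HOSTS="data1.zeal.name data2.zeal.name data3.zeal.name"

# regionservers: one host name per line (word splitting of $HOSTS is intended)
printf '%s\n' $HOSTS > "$CONF_DIR/regionservers"

# hbase-env.sh: append, so any existing settings are preserved.
# ZooKeeper runs as its own service here, so HBase should not manage one.
cat >> "$CONF_DIR/hbase-env.sh" <<'EOF'
export HBASE_MANAGES_ZK=false
EOF

# hbase-site.xml: port 8020 and the quorum membership are assumptions
cat > "$CONF_DIR/hbase-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://data1.zeal.name:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>$(echo $HOSTS | tr ' ' ',')</value>
  </property>
</configuration>
EOF

echo "wrote config to $CONF_DIR"
```

Review the generated files before copying them into hbase-0.98.6-cdh5.3.0/conf; hbase.rootdir in particular has to match the fs.defaultFS value of the running HDFS.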
Switch Hadoop back to the distributed (non-HA) configuration
[zeal@data2 etc]$ mv hadoop/ hadoop-ha
[zeal@data2 etc]$ mv dist-hadoop/ hadoop
[zeal@data2 data]$ mv tmp/ tmp-ha
[zeal@data2 data]$ mv dist-tmp/ tmp
Distribute HBase to the other machines
[zeal@data1 modules]$ scp -r hbase-0.98.6-cdh5.3.0/ zeal@data2.zeal.name:/opt/modules/
[zeal@data1 modules]$ scp -r hbase-0.98.6-cdh5.3.0/ zeal@data3.zeal.name:/opt/modules/
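The two scp commands above differ only in the target host, so they can be folded into a loop. The helper below is a hypothetical convenience (the name distribute and the DRY_RUN switch are not part of the original notes); by default it only prints the commands it would run.

```shell
# Print (or run) one scp command per target host.
# $1: source directory, $2: destination directory, rest: host names.
# DRY_RUN=1 (the default) prints the commands instead of executing them.
distribute() {
  local src="$1" dest="$2"
  shift 2
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -r $src zeal@$host:$dest"
    else
      scp -r "$src" "zeal@$host:$dest"
    fi
  done
}

distribute hbase-0.98.6-cdh5.3.0/ /opt/modules/ data2.zeal.name data3.zeal.name
```

Run it from /opt/modules on data1 with DRY_RUN=0 to perform the copy for real.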
Start ZooKeeper
[zeal@data1 zookeeper-3.4.5-cdh5.10.0]$ sbin/zkServer.sh start
[zeal@data2 zookeeper-3.4.5-cdh5.10.0]$ sbin/zkServer.sh start
[zeal@data3 zookeeper-3.4.5-cdh5.10.0]$ sbin/zkServer.sh start
Start HDFS
[zeal@data1 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
[zeal@data1 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
[zeal@data2 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
[zeal@data3 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
Start HBase
[zeal@data1 hbase-0.98.6-cdh5.3.0]$ bin/hbase-daemon.sh start master
[zeal@data1 hbase-0.98.6-cdh5.3.0]$ bin/hbase-daemon.sh start regionserver
[zeal@data2 hbase-0.98.6-cdh5.3.0]$ bin/hbase-daemon.sh start regionserver
[zeal@data3 hbase-0.98.6-cdh5.3.0]$ bin/hbase-daemon.sh start regionserver
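After the start commands above, a quick way to confirm that every daemon came up is to look at the jps listing on each node. The helper below is a sketch, not part of the original notes: missing_daemons is a hypothetical name, and it relies on the usual JVM main-class names (QuorumPeerMain for ZooKeeper, NameNode and DataNode for HDFS, HMaster and HRegionServer for HBase).

```shell
# Report which expected daemons are absent from a jps listing.
# $1: text of `jps` output; remaining args: expected process names.
missing_daemons() {
  local listing="$1"
  shift
  local name
  for name in "$@"; do
    # jps prints "<pid> <MainClass>"; compare the second field exactly
    if ! printf '%s\n' "$listing" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# On data1, where every role from these notes runs (skipped if jps is absent):
if command -v jps >/dev/null 2>&1; then
  missing_daemons "$(jps)" QuorumPeerMain NameNode DataNode HMaster HRegionServer
fi
```

An empty result means all listed daemons are running; on data2 and data3 the expected list would be shorter (QuorumPeerMain, DataNode, HRegionServer).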
Open the HBase master status page
http://data1.zeal.name:60010/master-status
HBase database operation test
* In the HBase shell, press Ctrl+Backspace to delete a typed character.
[zeal@data1 hbase-0.98.6-cdh5.3.0]$ bin/hbase shell
hbase(main):001:0> help
Group name: general
Commands: status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, flush, hlog_roll, major_compact, merge_region, move, split, trace, unassign, zk_dump
Group name: replication
Commands: add_peer, disable_peer, enable_peer, list_peers, list_replicated_tables, remove_peer, set_peer_tableCFs, show_peer_tableCFs
Group name: snapshots
Commands: clone_snapshot, delete_snapshot, list_snapshots, rename_snapshot, restore_snapshot, snapshot
Group name: quotas
Commands: list_quotas, set_quota
Group name: security
Commands: grant, revoke, user_permission
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, set_auths, set_visibility
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/docs/current/book.html
hbase(main):002:0> list
=> []
hbase(main):003:0> create 'test','info'
0 row(s) in 2.1490 seconds
=> Hbase::Table - test
hbase(main):004:0> list
TABLE
test
1 row(s) in 0.0100 seconds
=> ["test"]
hbase(main):005:0> put 'test','0001','info:userName','zouyouzhen'
0 row(s) in 0.2850 seconds
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
0001 column=info:userName, timestamp=1555731031998, value=zouyouzhen
1 row(s) in 0.0340 seconds
hbase(main):007:0> put 'test','0001','info:age','30'
0 row(s) in 0.0100 seconds
hbase(main):008:0> put 'test','0001','info:tel','1300000001'
0 row(s) in 0.0110 seconds
hbase(main):009:0> scan 'test'
ROW COLUMN+CELL
0001 column=info:age, timestamp=1555731089324, value=30
0001 column=info:tel, timestamp=1555731116238, value=1300000001
0001 column=info:userName, timestamp=1555731031998, value=zouyouzhen
1 row(s) in 0.0170 seconds
hbase(main):010:0> put 'test','0002','info:userName','xx'
0 row(s) in 0.0080 seconds
hbase(main):011:0> scan 'test'
ROW COLUMN+CELL
0001 column=info:age, timestamp=1555731089324, value=30
0001 column=info:tel, timestamp=1555731116238, value=1300000001
0001 column=info:userName, timestamp=1555731031998, value=zouyouzhen
0002 column=info:userName, timestamp=1555731237997, value=xx
2 row(s) in 0.0310 seconds
hbase(main):012:0> describle 'test'
NoMethodError: undefined method `describle' for #<Object:0x61f460c1>
hbase(main):013:0> describe 'test'
DESCRIPTION                                                  ENABLED
 'test', {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE',     true
 BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS
 => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL
 => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE
 => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.2230 seconds
hbase(main):014:0> disable 'test'
0 row(s) in 1.5540 seconds
hbase(main):015:0> drop 'test'
0 row(s) in 1.5540 seconds
hbase(main):016:0> list
TABLE
0 row(s) in 0.0100 seconds
=> []
hbase(main):003:0> quit
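The interactive session above can also be replayed non-interactively. The sketch below writes the same commands (minus the mistyped describle) to a temporary file and passes its path to the HBase shell, which, as far as the HBase reference guide describes, executes a script file given as its argument. The guard around bin/hbase and the variable name SCRIPT are additions for illustration.

```shell
# Write the commands from the session above to a replayable script file.
SCRIPT="$(mktemp)"
cat > "$SCRIPT" <<'EOF'
create 'test','info'
put 'test','0001','info:userName','zouyouzhen'
put 'test','0001','info:age','30'
put 'test','0001','info:tel','1300000001'
put 'test','0002','info:userName','xx'
scan 'test'
describe 'test'
disable 'test'
drop 'test'
exit
EOF

# Run it only when started from the HBase install directory.
if [ -x bin/hbase ]; then
  bin/hbase shell "$SCRIPT"
else
  echo "bin/hbase not found; script left at $SCRIPT"
fi
```

The trailing exit matters: without it the shell stays open after the last command. Note also that a table has to be disabled before it can be dropped, which is why disable precedes drop.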
Backup master configuration in the HBase cluster
Create conf/backup-masters
Add the backup master host: data2.zeal.name
Distribute the file to the other machines
Restart the HBase service
[zeal@data1 hbase-0.98.6-cdh5.3.0]$ bin/start-hbase.sh
http://data1.zeal.name:60010/master-status
http://data2.zeal.name:60010/master-status
Stop the master process on data1; the HBase service automatically fails over to data2
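The backup-master steps above can be sketched as follows. HBASE_CONF is a hypothetical variable that defaults to a scratch directory, and the scp lines are only printed, so nothing on the cluster changes until the commands are reviewed and run by hand.

```shell
# Create conf/backup-masters naming the standby master host.
# HBASE_CONF defaults to a scratch directory for safe experimentation.
HBASE_CONF="${HBASE_CONF:-$(mktemp -d)}"
BACKUP_MASTER="data2.zeal.name"

echo "$BACKUP_MASTER" > "$HBASE_CONF/backup-masters"

# The notes distribute the file to the other machines; print the
# commands one would run from data1 rather than executing them.
for host in data2.zeal.name data3.zeal.name; do
  echo "scp $HBASE_CONF/backup-masters zeal@$host:/opt/modules/hbase-0.98.6-cdh5.3.0/conf/"
done

# To exercise the failover afterwards, stop the active master on data1:
#   bin/hbase-daemon.sh stop master
# and reload http://data2.zeal.name:60010/master-status
```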
Download the user query logs
http://www.sogou.com/labs/resource/q.php
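The data format is
access time\tuser ID\t[query term]\trank of the URL in the returned results\tsequence number of the user's click\tURL the user clicked
The user ID is assigned automatically from the cookie set when the user visits the search engine in a browser, so different queries entered during the same browser session share one user ID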
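To make the six-field layout concrete, the sketch below splits one record on tabs and strips the square brackets around the query term. The sample line is made up for illustration and is not taken from the real log; parse_record is a hypothetical helper name.

```shell
# One made-up record in the six-field, tab-separated layout described above.
sample="$(printf '00:00:01\tuid123\t[hbase]\t1\t2\thttp://example.com/')"

# Split on tabs, drop the [ ] around the query, and print named fields.
# Lines that do not have exactly six fields are skipped.
parse_record() {
  awk -F '\t' 'NF == 6 {
    query = $3
    gsub(/^\[|\]$/, "", query)
    printf "time=%s uid=%s query=%s rank=%s order=%s url=%s\n", $1, $2, query, $4, $5, $6
  }'
}

printf '%s\n' "$sample" | parse_record
```

Applied to the downloaded file, the same function can be fed with `parse_record < file` to spot-check that the records match the described format before loading them.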
Create the table structure in HBase
[zeal@data1 hbase-0.98.6-cdh5.3.0]$ bin/hbase shell
2019-04-22 17:13:13,002 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.6-cdh5.3.0, rUnknown, Tue Dec 16 19:18:44 PST 2014
hbase(main):001:0> create 'weblogs','info'
2019-04-22 17:21:00,070 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0 row(s) in 4.4680 seconds
=> Hbase::Table - weblogs
hbase(main):002:0> list
TABLE
weblogs
1 row(s) in 0.4120 seconds
=> ["weblogs"]
hbase(main):003:0> quit