How do I delete all data from solr by command? We are using solr with lily and hbase.
How can I delete data from both hbase and solr?
http://lucene.apache.org/solr/4_10_0/tutorial.html#Deleting+Data
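Following the tutorial linked above, a delete-by-query for everything can be issued from the command line. A minimal sketch, assuming the default core name (collection1) and port; the HBase table name is a placeholder for whichever table Lily indexes:

```shell
# Delete every document in the Solr index via the update handler;
# commit=true makes the deletion visible immediately.
curl "http://localhost:8983/solr/collection1/update?commit=true" \
  -H "Content-Type: text/xml" \
  --data-binary "<delete><query>*:*</query></delete>"

# On the HBase side, truncate drops all rows of the table
# ('my_table' is a hypothetical name).
echo "truncate 'my_table'" | hbase shell
```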
I have a multinode Hadoop cluster with two nodes (one master node and one slave node), each with 8 GB of RAM.
I have also configured Hive on the master node, and everything is up and working.
The NodeManager and DataNode are running on the slave node.
The ResourceManager, NameNode, and SecondaryNameNode are running on the master node.
I am able to access the Hive terminal as well, but I am not able to drop a database with the drop database databaseName; command. It shows no error but has been stuck for more than an hour... Three tables have size 10000 * 20. I thought these might be causing the speed issues, so I wanted to delete the database, but since I cannot do it via the drop database command, is there any way to do it directly by deleting the underlying files?
I have tried to access hive.metastore.warehouse.dir to delete the database directly, but this directory is completely empty.
Similar slow behavior can be observed with other Hive commands as well. I am just able to run one...
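A hedged sketch of the two usual options here: DROP DATABASE ... CASCADE (required when the database still contains tables), and, as a last resort, removing the warehouse directory on HDFS directly. The warehouse path shown is only the common default; check hive.metastore.warehouse.dir in your config, and note that deleting files directly leaves stale entries in the metastore:

```shell
# CASCADE drops the contained tables first, which a plain
# DROP DATABASE refuses to do on a non-empty database.
hive -e "DROP DATABASE IF EXISTS databaseName CASCADE;"

# Last resort: remove the database's warehouse directory on HDFS directly
# (default location shown; verify hive.metastore.warehouse.dir first).
hdfs dfs -rm -r /user/hive/warehouse/databaseName.db
```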
I am new to Hive, and want to know if there is any way to insert data into a Hive table as we do in SQL. I want to insert my data into Hive like
INSERT INTO tablename VALUES (value1, value2, ...)
I have read that you can load data from a file into a Hive table, or import data from one table into another, but is there any way to append rows as in SQL?
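For what it's worth, row-wise inserts do exist in newer Hive: INSERT ... VALUES is supported from Hive 0.14 onward. A minimal sketch (the table name and values are placeholders):

```shell
# INSERT ... VALUES works from Hive 0.14 on; each statement launches a
# job, so it is fine for appending a few rows, not for bulk loading.
hive -e "INSERT INTO TABLE tablename VALUES (1, 'value2');"
```

For bulk data, LOAD DATA from a file or INSERT ... SELECT from another table remains the usual route.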
While executing any command in the HBase shell, I am receiving the following error: ERROR: KeeperErrorCode = NoNode for /hbase/master
Started HBase:
HOSTCHND:hbase-2.0.0 gvm$ ./bin/start-hbase.sh
localhost: running zookeeper, logging to /usr/local/Cellar/hbase-2.0.0/bin/../logs/hbase-gvm-zookeeper-HOSTCHND.local.out
running master, logging to /usr/local/Cellar/hbase-2.0.0/logs/hbase-gvm-master-HOSTCHND.local.out
: running regionserver, logging to /usr/local/Cellar/hbase-2.0.0/logs/hbase-gvm-regionserver-HOSTCHND.local.out
While checking status in the HBase shell:
hbase(main):001:0> status
ERROR: KeeperErrorCode = NoNode for /hbase/master
Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The default is 'summary'. Examples:
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
hbase> status 'replication'
hbase> status 'replication', 'source'
hbase> status 'replication', 'sink'
Took 9.4096 seconds
hbase(main):002:0>
hbase-site.xml...
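That NoNode error usually means the HMaster never registered itself in ZooKeeper, typically because it died shortly after start-up. A hedged way to check, using the log path from the start-up output above:

```shell
# List what HBase actually registered in ZooKeeper; if the 'master'
# znode is missing under /hbase, the HMaster is not running.
hbase zkcli ls /hbase

# The master log normally shows why it exited.
tail -n 50 /usr/local/Cellar/hbase-2.0.0/logs/hbase-gvm-master-HOSTCHND.local.out
```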
We have over 100m rows of analytics data in BigQuery. Each record is an event attached to an id.
A simplification:
ID EventId Timestamp
Is it possible to flatten this to one table holding rows like:
ID timestamp-period event1 event2 event3 event4
Where the event columns hold the counts of the number of events for that id in that time period?
So far, I've managed to do it on small data sets with two queries: one to create rows that hold counts for an individual event id, and another to flatten these into one row afterwards. The reason I haven't yet been able to do this across the whole data set is that BigQuery runs out of resources - I'm not entirely sure why.
These two queries look something like this:
SELECT VideoId, date_1, IF(EventId = 1, INTEGER(count), 0) AS user_play, IF(EventId = 2, INTEGER(count), 0) AS auto_play, IF(EventId = 3, INTEGER(count), 0) AS pause, IF(EventId = 4, INTEGER(count), 0) AS replay, IF(EventId = 5, INTEGER(count), 0) AS stop, IF(EventId = 6, INTEGER(count), 0) AS seek, ...
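For what it's worth, the two-step pivot can usually be collapsed into a single aggregation, which also sidesteps the intermediate row explosion that tends to exhaust resources. A sketch using the bq CLI, keeping the legacy-SQL IF style of the query above ([mydataset.events] is a placeholder table name):

```shell
# One-pass pivot: count each event type per VideoId and day
# in a single GROUP BY, instead of pivot-then-flatten.
bq query "
SELECT VideoId, DATE(Timestamp) AS date_1,
       SUM(IF(EventId = 1, 1, 0)) AS user_play,
       SUM(IF(EventId = 2, 1, 0)) AS auto_play,
       SUM(IF(EventId = 3, 1, 0)) AS pause,
       SUM(IF(EventId = 4, 1, 0)) AS replay,
       SUM(IF(EventId = 5, 1, 0)) AS stop,
       SUM(IF(EventId = 6, 1, 0)) AS seek
FROM [mydataset.events]
GROUP BY VideoId, date_1"
```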
I want to export all collections in MongoDB by the command:
mongoexport -d dbname -o Mongo.json
The result is: No collection specified!
The manual says that if you don't specify a collection, all collections will be exported. However, why doesn't this work?
http://docs.mongodb.org/manual/reference/mongoexport/#cmdoption-mongoexport--collection
My MongoDB version is 2.0.6.
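In practice mongoexport handles one collection per invocation, so a common workaround is to loop over the collection names reported by the mongo shell. A minimal sketch (dbname as in the command above; each collection lands in its own JSON file):

```shell
# Ask the mongo shell for the collection names, then export each one.
for c in $(mongo dbname --quiet --eval "db.getCollectionNames().join('\n')"); do
  mongoexport -d dbname -c "$c" -o "$c.json"
done
```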
I am trying to understand what would be the best big data solution for reporting purposes.
Currently I narrowed it down to HBase vs Hive.
The use case is that we have hundreds of terabytes of data across hundreds of different files. The data is live and gets updated all the time. We need to provide the most efficient way to do reporting. We have dozens of different report pages, where each report consists of different types of numeric and graph data. For instance:
Show all users that logged in to the system in the last hour whose origin is the US.
Show a graph with the most played games to the least played games.
From all users in the system, show the percentage of paying vs. non-paying users.
For a given user, show his entire history: how many games he played, what kind of games, and what his score was in each and every game.
The way I see it, there are 3 solutions:
Store all data in Hadoop and do the queries in Hive. This might work, but I am not sure about the performance. How will it perform when the...
I have an HBase with 750 GB of data. All data in HBase are time series sensor data, and my row key design is like this:
deviceID,sensorID,timestamp
I want to prepare all the data in HBase for batch processing (for example, CSV format on HDFS). But there is a lot of data in HBase. Can I prepare the data using Hive without fetching it piecemeal? Because if I fetch data by sensor id (a scan query with start and end rows), I must specify the start and end row every time, and I don't want to do this.
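One hedged approach, assuming the Hive HBase storage handler is available on your cluster: map the existing HBase table into Hive once, then dump everything to delimited files on HDFS in a single pass, with no per-sensor start/end rows. The table name, column family/qualifier, and output path below are placeholders:

```shell
# Map the HBase table into Hive, then export all rows as CSV on HDFS.
hive -e "
CREATE EXTERNAL TABLE sensor_data(rowkey STRING, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:value')
TBLPROPERTIES ('hbase.table.name' = 'sensor_table');

INSERT OVERWRITE DIRECTORY '/tmp/sensor_csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM sensor_data;"
```

The export runs as a full-table MapReduce scan under the hood, so it touches every region once rather than requiring per-sensor range queries.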
From the official Hive documentation: "Hive aims to provide acceptable (but not optimal) latency for interactive data browsing, queries over small data sets or test queries."
I'm not an expert on database architecture, and I would like to know whether there is an alternative when the assumption above does not hold, that is, when queries are made over a big data set.
From my understanding, HBase is the Hadoop database and Hive is the data warehouse.
Hive lets you create tables and store data in them; you can also map your existing HBase tables to Hive and operate on them.
Why should we use HBase if Hive can do all that? Can we use Hive by itself? I'm confused :(