Getting the correct timestamp from Cassandra using the DataStax python-driver


This document is meant to provide an overview of the assumptions and limitations of the driver's time handling, the reasoning behind it, and describe approaches to working with these types.

When inserting timestamps, the driver handles serialization for the write path as follows:

  • If the datetime object is timezone-aware, the timestamp is shifted, and represents the UTC timestamp equivalent.
  • If the datetime object is timezone-naive, the timestamp is not shifted. Naive datetimes do not contain timezone information intrinsically, so they will be assumed to be UTC. When generating timestamps in the application, it is clearer to use datetime.utcnow() to be explicit about it.

Note the second point above applies even to “local” times created using now():

>>> d = datetime.now()
>>> print(d.tzinfo)
None
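
As a rough illustration of the write-path rule (a sketch using only the standard library, not the driver's own code), the helper below normalizes a datetime through utctimetuple(). The function name to_epoch_seconds and the fixed +05:30 offset are assumptions made for the example.

```python
import calendar
from datetime import datetime, timedelta, timezone


def to_epoch_seconds(dt):
    """Seconds since the Unix epoch, treating a naive datetime as UTC."""
    # utctimetuple() shifts an aware datetime to UTC and leaves a naive one unchanged.
    return calendar.timegm(dt.utctimetuple())


ist = timezone(timedelta(hours=5, minutes=30))      # hypothetical fixed +05:30 offset
aware = datetime(2013, 10, 30, 10, 32, 45, tzinfo=ist)
naive_utc = datetime(2013, 10, 30, 5, 2, 45)        # same instant, written as UTC
naive_local = datetime(2013, 10, 30, 10, 32, 45)    # local wall clock, no tzinfo

# The aware value and the naive UTC value describe the same instant.
same_instant = to_epoch_seconds(aware) == to_epoch_seconds(naive_utc)

# A naive local time is read as UTC, so it lands 5h30m (19800 s) too late.
skew_seconds = to_epoch_seconds(naive_local) - to_epoch_seconds(aware)
```

Under this rule a value produced by now() on a machine that is not set to UTC would be stored with an offset error equal to the local UTC offset.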

The decision for how to handle timezones is left to the application. For the most part it is straightforward to apply localization to the datetimes returned by queries. One prevalent method is to use pytz for localization:

import pytz
user_tz = pytz.timezone('US/Central')
timestamp_naive = row.ts
timestamp_utc = pytz.utc.localize(timestamp_naive)
timestamp_presented = timestamp_utc.astimezone(user_tz)

Suggestion : 2

I am retrieving timestamps from a table using the datastax python-driver. What I am trying to do is store the previously retrieved timestamp in a var and use it in the next query to retrieve a timestamp greater than the previous one. The query basically looks like this:

cqlsh> SELECT insert_time, message FROM cf WHERE message_key = 'q1'
       AND insert_time > '2013-10-30 10:32:44+0530'
       ORDER BY insert_time ASC LIMIT 1;

 insert_time              | message
--------------------------+----------------------------------
 2013-10-30 10:32:45+0530 | 83500612412011e3ab6c1c3e84abd9db

As you can see, the timestamp from CQL is 2013-10-30 10:32:45+0530. But when I retrieve it via the python-driver the results are different (I am executing the Python query on a different system and not on any of the Cassandra nodes):

>>> from cassandra.cluster import Cluster
>>> c = Cluster(['10.60.60.2'])
>>> session = c.connect()
>>> q = "SELECT insert_time, message FROM cf WHERE message_key='q1' AND insert_time>'2013-10-30 10:32:44+0530' ORDER BY insert_time ASC LIMIT 1"
>>> rows = session.execute(q)
>>> print rows
[Row(insert_time=datetime.datetime(2013, 10, 30, 5, 2, 45, 4000), message=u'83500612412011e3ab6c1c3e84abd9db')]
>>> timestamp = rows[0][0]
>>> print timestamp
2013-10-30 05:02:45.004000
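
The two results appear to describe the same instant: 05:02:45 UTC is 10:32:45 at +05:30, and the driver returns a naive datetime in UTC. A minimal sketch of converting the returned value back to the offset shown by cqlsh follows. It is written for Python 3 (the session above is Python 2), and the names display_tz and cql_literal are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Value returned by the driver in the session above: naive, but expressed in UTC.
returned = datetime(2013, 10, 30, 5, 2, 45, 4000)

# Hypothetical display zone matching the +0530 offset used in cqlsh.
display_tz = timezone(timedelta(hours=5, minutes=30))

as_utc = returned.replace(tzinfo=timezone.utc)   # label the naive value as UTC
as_local = as_utc.astimezone(display_tz)         # shift it for presentation


def cql_literal(dt):
    """Format an aware datetime as 'YYYY-MM-DD HH:MM:SS.mmm+hhmm'."""
    millis = dt.microsecond // 1000
    return "%s.%03d%s" % (dt.strftime("%Y-%m-%d %H:%M:%S"), millis, dt.strftime("%z"))


literal = cql_literal(as_local)
print(literal)  # 2013-10-30 10:32:45.004+0530
```

Note the 4 ms that cqlsh did not display. If the next query compares against '2013-10-30 10:32:45+0530', the same row would likely match again, because its stored value is slightly later than that literal. Passing the datetime itself as a bound query parameter, rather than building a string, avoids the formatting step altogether; whether a millisecond-precision literal is accepted depends on the Cassandra version and is an assumption here.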

Suggestion : 3

Statement ordering is not supported by CQL3 batches. Therefore, when Cassandra needs to resolve a conflict (updating the same column more than once in one batch), the algorithm below is used:

  • If timestamps are different, pick the column with the largest timestamp (the value being a regular column or a tombstone).
  • If timestamps are the same, and none of the columns are tombstones, pick the column with the largest value.

The workaround is to apply a timestamp to each statement, so that Cassandra resolves to the statement with the latest timestamp.
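
The resolution rules can be imitated in plain Python to see why a batch without explicit timestamps mixes values from both statements. This is only a sketch: the resolve function and the (timestamp, value) tuples are hypothetical, tombstones are not modeled, and Python's own ordering stands in for Cassandra's value comparison.

```python
def resolve(a, b):
    """Pick the winning (timestamp, value) cell of two writes to one column."""
    ts_a, val_a = a
    ts_b, val_b = b
    if ts_a != ts_b:
        # Different timestamps: the larger timestamp wins.
        return a if ts_a > ts_b else b
    # Same timestamp, no tombstones: the larger value wins.
    return a if val_a >= val_b else b


# Two writes in one batch share a timestamp (100 is an arbitrary stand-in).
first = {"count": (100, 2), "text": (100, "123")}
second = {"count": (100, 3), "text": (100, "111")}
merged_same_ts = {col: resolve(first[col], second[col])[1] for col in first}

# With a later timestamp on the second write, it wins for every column.
second_later = {"count": (101, 3), "text": (101, "111")}
merged_distinct_ts = {col: resolve(first[col], second_later[col])[1] for col in first}
```

With equal timestamps the result takes count from the second write and text from the first, which is the mixed row the batch example asserts.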

from datetime import datetime

from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.query import BatchQuery


class MyModel(Model):
    id = columns.Integer(primary_key=True)
    count = columns.Integer()
    text = columns.Text()

# Without explicit timestamps, each column is resolved by largest value.
with BatchQuery() as b:
    MyModel.batch(b).create(id=1, count=2, text='123')
    MyModel.batch(b).create(id=1, count=3, text='111')

assert MyModel.objects(id=1).first().count == 3
assert MyModel.objects(id=1).first().text == '123'

# With a timestamp on each statement, the latest statement wins.
with BatchQuery() as b:
    MyModel.timestamp(datetime.now()).batch(b).create(id=1, count=2, text='123')
    MyModel.timestamp(datetime.now()).batch(b).create(id=1, count=3, text='111')

assert MyModel.objects(id=1).first().count == 3
assert MyModel.objects(id=1).first().text == '111'

# Separate example: setting a column to None deletes its value.
class MyModel(Model):
    id = columns.Integer(primary_key=True)
    text = columns.Text()

m = MyModel.create(id=1, text='We can delete this with None')
assert MyModel.objects(id=1).first().text is not None

m.update(text=None)
assert MyModel.objects(id=1).first().text is None

Suggestion : 4

Timestamps can be input in CQL either using their value as an integer, or using a string that represents an ISO 8601 date. For instance, Mar 2, 2011, at 04:05:00 AM, GMT can be written in either form.

For the date type, a date can be input either as an integer or using a date string. In the latter case, the format should be yyyy-mm-dd (so '2011-02-03' for instance).

For the time type, a time can be input either as an integer or using a string representing the time. In the latter case, the format should be hh:mm:ss[.fffffffff] (where the sub-second precision is optional and, if provided, can be less than the nanosecond).

In other words, a UDT literal is like a map literal but its keys are the names of the fields of the type. For instance, one could insert into the table defined in the previous section using:
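
To relate the integer and string forms of a timestamp, here is a small standard-library sketch. It assumes the integer form counts milliseconds since the Unix epoch in UTC; the helper names to_epoch_millis and from_epoch_millis are made up for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Milliseconds since the epoch; a naive datetime is taken to be UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    # Dividing two timedeltas with // gives an exact integer.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(ms):
    """Aware UTC datetime for a millisecond epoch value."""
    return EPOCH + timedelta(milliseconds=ms)


# Mar 2, 2011, at 04:05:00 AM, GMT
moment = datetime(2011, 3, 2, 4, 5, 0, tzinfo=timezone.utc)
millis = to_epoch_millis(moment)                      # integer form
iso_text = moment.strftime("%Y-%m-%dT%H:%M:%S%z")     # string form with offset
```

Integer arithmetic is used instead of multiplying a float timestamp by 1000, so no rounding error creeps into the millisecond value.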

cql_type ::= native_type | collection_type | user_defined_type | tuple_type | custom_type
native_type ::= ASCII | BIGINT | BLOB | BOOLEAN | COUNTER | DATE |
   DECIMAL | DOUBLE | DURATION | FLOAT | INET | INT |
   SMALLINT | TEXT | TIME | TIMESTAMP | TIMEUUID | TINYINT |
   UUID | VARCHAR | VARINT

INSERT INTO RiderResults (rider, race, result)
   VALUES ('Christopher Froome', 'Tour de France', 89h4m48s);
INSERT INTO RiderResults (rider, race, result)
   VALUES ('BARDET Romain', 'Tour de France', PT89H8M53S);
INSERT INTO RiderResults (rider, race, result)
   VALUES ('QUINTANA Nairo', 'Tour de France', P0000-00-00T89:09:09);

collection_type ::= MAP '<' cql_type ',' cql_type '>'
   | SET '<' cql_type '>'
   | LIST '<' cql_type '>'
collection_literal ::= map_literal | set_literal | list_literal
map_literal ::= '{' [ term ':' term (',' term ':' term)* ] '}'
set_literal ::= '{' [ term (',' term)* ] '}'
list_literal ::= '[' [ term (',' term)* ] ']'

CREATE TABLE users (
   id text PRIMARY KEY,
   name text,
   favs map<text, text> // A map of text keys, and text values
);

INSERT INTO users (id, name, favs)
   VALUES ('jsmith', 'John Smith', { 'fruit' : 'Apple', 'band' : 'Beatles' });

// Replace the existing map entirely.
UPDATE users SET favs = { 'fruit' : 'Banana' } WHERE id = 'jsmith';

Suggestion : 5

  • How do I configure and execute BatchStatement in Cassandra correctly?
  • How do I get the size of a Cassandra table using the Python driver?

SELECT * FROM system_views.disk_usage
WHERE keyspace_name = 'stackoverflow'
AND table_name = 'baseball_stats';

 keyspace_name | table_name     | mebibytes
---------------+----------------+-----------
 stackoverflow | baseball_stats |         1

(1 rows)

MATCH (d:data)
RETURN apoc.date.format(d.submitted, 'ms', 'YYYY-MM') AS month,
       avg(d.score) AS score
ORDER BY month DESC
LIMIT 12

MATCH (d:data)
WITH d, datetime({epochMillis: d.submitted}) AS dt
RETURN dt.year AS year,
       dt.month AS month,
       avg(d.score) AS score
ORDER BY year DESC, month DESC
LIMIT 12

WITH date() AS today
UNWIND [
   i IN range(0, 11) |
   datetime.truncate('month', today - duration({months: i}))
] AS firstDayOfMonth
OPTIONAL MATCH (A:data)
WHERE A.submitted >= timestamp(firstDayOfMonth) AND
      A.submitted < timestamp(firstDayOfMonth + duration({months: 1}))
RETURN apoc.date.format(timestamp(firstDayOfMonth), 'ms', 'YYYY-MM') AS month,
       coalesce(avg(A.score), 0) AS score

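from cassandra.query import BatchStatement, BatchType, SimpleStatement
...
# Unlogged batch: add each statement with its bound values, then run it once.
batch = BatchStatement(batch_type=BatchType.UNLOGGED)
batch.add(SimpleStatement(cql_statement), (name_1, age_1))
batch.add(SimpleStatement(cql_statement), (name_2, age_2))
batch.add(SimpleStatement(cql_statement), (name_3, age_3))
session.execute(batch)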

cve_ = colCVE.aggregate([{"$sort": {"Modified": -1}}], allowDiskUse=True)
cve_ = colCVE.find(allow_disk_use=True).sort("Modified", pymongo.DESCENDING)

import os

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

cloud_config = {
    'secure_connect_bundle': '/path/to/secure-connect-dbname.zip'
}
auth_provider = PlainTextAuthProvider(
    username=os.environ['CASSANDRA_USERNAME'],
    password=os.environ['CASSANDRA_PASSWORD'])
cluster = Cluster(cloud=cloud_config, auth_provider=auth_provider)
session = cluster.connect()

docker run -it --add-host=neo4j:[your-host-ip] user/test-neo4j:latest

Suggestion : 6

I am writing some Python code that will collect data over time. I need to store this in Cassandra. I have spent my entire day on this but cannot find something that works.

1) How do I make a CQL timestamp in Python: timenow? This did not help (too complicated for my Cassandra level): Cassandra 1.2 inserting/updating a blob column type using Python and the cql library

Solved. Needed to use the latest DataStax drivers for blob, and the above INSERT method (not string converted), and proper pickle and bytearray.

The Python module for working with a Cassandra database is the DataStax Python driver, published as cassandra-driver. This module contains an ORM API, as well as a core API similar in nature to DB-API for relational databases. The driver can be installed with the pip utility.


CREATE TABLE timearchive(name_yymmddhh text, name text, ip text, time_current timestamp, data blob, PRIMARY KEY(name_yymmddhh, time_current));

query = """INSERT INTO timearchive (name_yymmddhh, name, ip, time_current, data) VALUES (:name_yymmddhh, :name, :ip, :time_current, :data)"""
values = {
    'name_yymmddhh': rowkey,
    'name': dcname,
    'ip': ip,
    'time_current': timenow,
    'data': my_blob
}
cursor.execute(query, values)

my_dict = {
    'one': 1,
    'two': 2,
    'three': 3
}
...
my_blob = ???
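
The "Solved" note mentions pickle and bytearray without showing code, so the following is only a guess at what that might look like, using the standard library alone. The names timenow and my_blob come from the question; the choice of a naive UTC datetime for timenow is an assumption based on how the driver treats naive values.

```python
import pickle
from datetime import datetime, timezone

my_dict = {'one': 1, 'two': 2, 'three': 3}

# Serialize the dict to bytes, then wrap it in a bytearray for the blob column.
my_blob = bytearray(pickle.dumps(my_dict))

# A naive datetime holding the current UTC time for the timestamp column.
timenow = datetime.now(timezone.utc).replace(tzinfo=None)

# Reading back: turn the blob into bytes again and unpickle it.
restored = pickle.loads(bytes(my_blob))
```

Pickle should only be used for data the application wrote itself, since unpickling untrusted bytes can execute arbitrary code.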