2016-03-22

Amazon Kinesis Client Library (KCL) V2 IRecordProcessor

Starting with version 1.5.0, the KCL comes with some changes, as stated in the official documentation. [http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-implementation-app-java.html]

There is a new version of the record processor interface.
In this article I want to give some details about its initialize method.

IRecordProcessor V2
 void initialize(InitializationInput initializationInput)  
 void processRecords(ProcessRecordsInput processRecordsInput)  
 void shutdown(ShutdownInput shutdownInput)  

initialize

InitializationInput

In addition to the shardId of the shard being processed, you get an instance of ExtendedSequenceNumber. The documentation states that this is either the sequence number the record processor starts from or the last uncommitted two-phase checkpoint.

I use this sequence number to initialize an internal last processed marker in my record processor implementation.

 

Edge case

After some problems with the DynamoDB metadata table, a colleague dropped its content. From that point on my record processor behaved strangely. Checking the log file, I was surprised to find "LATEST" as the given sequence number.

 

Solution

After some investigation I found some static definitions within ExtendedSequenceNumber. These are symbolic pointers into the shard stream. Using them, you can tell whether the current sequence number is a real sequence number or a stream pointer. I also found some helper methods, but all of them are declared private.

ExtendedSequenceNumber

   /**  
    * Special value for LATEST.  
    */  
   public static final ExtendedSequenceNumber LATEST =   
       new ExtendedSequenceNumber(SentinelCheckpoint.LATEST.toString());  
   /**  
    * Special value for SHARD_END.  
    */  
   public static final ExtendedSequenceNumber SHARD_END =   
     new ExtendedSequenceNumber(SentinelCheckpoint.SHARD_END.toString());  
   /**  
    *   
    * Special value for TRIM_HORIZON.  
    */  
   public static final ExtendedSequenceNumber TRIM_HORIZON =   
       new ExtendedSequenceNumber(SentinelCheckpoint.TRIM_HORIZON.toString());  
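Since the helper methods are private, a small check of your own can do the job. The following is only a minimal sketch: the class SequenceNumbers and its method are hypothetical names, not part of the KCL, and it assumes that real Kinesis sequence numbers consist of decimal digits only, so that anything else (such as LATEST, TRIM_HORIZON or SHARD_END) is treated as a symbolic pointer.

```java
// Minimal sketch; class and method names are hypothetical and not part of the KCL.
final class SequenceNumbers {

    private SequenceNumbers() {
    }

    /**
     * Returns true only if the value looks like a real sequence number.
     * Assumption: real Kinesis sequence numbers consist of decimal digits only,
     * so symbolic values such as LATEST, TRIM_HORIZON or SHARD_END yield false.
     */
    static boolean isRealSequenceNumber(String sequenceNumber) {
        if (sequenceNumber == null || sequenceNumber.isEmpty()) {
            return false;
        }
        for (int i = 0; i < sequenceNumber.length(); i++) {
            char c = sequenceNumber.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A symbolic pointer must not be used as a last-processed marker.
        System.out.println(isRealSequenceNumber("LATEST"));
        // A digits-only value is accepted as a real sequence number.
        System.out.println(isRealSequenceNumber("49590338271490256608559692538361571095921575989136588898"));
    }
}
```

Inside initialize, the string to test would presumably come from the ExtendedSequenceNumber handed in via InitializationInput (for example through its getSequenceNumber() accessor), and the internal last-processed marker would only be set when the check returns true.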

Some reverse engineering

But I still wondered why these "pointer" sequence numbers are provided. After digging into the code and examining the client initialization I found the reason for this behavior.
A LeaseManager instance looks for the DynamoDB metadata table. If it doesn't exist, a new one is created. After that, a new KinesisClientLease instance is created for the InitialPositionInStream enumeration value that was passed via a KinesisClientLibConfiguration instance when building the Worker instance. That value is converted into the corresponding ExtendedSequenceNumber instance. Next, the KinesisClientLease instance is serialized into the metadata table. Finally, the initialize method is called with the sequence number from the KinesisClientLease.
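To illustrate why an emptied metadata table leads to "LATEST" showing up in initialize, here is a heavily simplified simulation of that flow. It is only a sketch: LeaseTableSketch, its nested enum and its methods are hypothetical stand-ins, an in-memory map plays the role of the DynamoDB table, and none of this is the KCL's actual code.

```java
// Simplified simulation of the flow described above; all names are
// hypothetical stand-ins and not the KCL's own classes.
final class LeaseTableSketch {

    enum InitialPosition { LATEST, TRIM_HORIZON }

    // Stand-in for the DynamoDB metadata table: shardId -> checkpoint string.
    private final java.util.Map<String, String> leaseTable = new java.util.HashMap<>();
    private final InitialPosition initialPosition;

    LeaseTableSketch(InitialPosition initialPosition) {
        this.initialPosition = initialPosition;
    }

    /** Returns the checkpoint that would be handed to initialize() for the shard. */
    String checkpointFor(String shardId) {
        // No lease stored yet: create one holding the symbolic initial position.
        return leaseTable.computeIfAbsent(shardId, id -> initialPosition.name());
    }

    /** Stores a real checkpoint, as happens after records were processed. */
    void checkpoint(String shardId, String sequenceNumber) {
        leaseTable.put(shardId, sequenceNumber);
    }

    /** Simulates dropping the content of the metadata table. */
    void clear() {
        leaseTable.clear();
    }

    public static void main(String[] args) {
        LeaseTableSketch table = new LeaseTableSketch(InitialPosition.LATEST);
        System.out.println(table.checkpointFor("shard-0")); // LATEST
        table.checkpoint("shard-0", "12345");
        System.out.println(table.checkpointFor("shard-0")); // 12345
        table.clear();
        System.out.println(table.checkpointFor("shard-0")); // LATEST
    }
}
```

The last call mirrors the edge case: once the stored lease is gone, the only value left to pass on is the configured initial position.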

2016-03-09

Native Docker Feeling For Windows Using Git Bash

Docker on Windows is easy because of the native client port. The CLI tools are pretty well integrated for use with PowerShell. But all available examples are written in Linux shell syntax, and translating them into PowerShell syntax is usually annoying and alienating.

But the Docker Toolbox comes to the rescue, as it requires Git for Windows. This Git distribution installs a version of MinGW and the MSYS tools, providing the Git Bash shell, which is similar to what you will find on Linux machines.

Using bash you can control any Docker host on reachable machines. If you have problems with the default terminal emulation, for example with interactive commands, prefix the command with winpty. There is a little culture clash between Linux and Windows console output.

For me this is a real productivity booster, because combined with ssh you can use one shell instance for almost all adventures on remote Linux machines.

2015-11-05

VirtualBox network blues

If you install the latest VirtualBox version on Windows 7, as I did, the setup may run smoothly and everything may look fine. But when you try something like the Docker Toolbox, all you get are some strange errors. This could point to a problem with the network stack.

In this case remove your current installation and do a restart. Now start the install, e.g. via "Run" like this:

VirtualBox-x.x.x-yyyyy-Win.exe -msiparams NETWORKTYPE=NDIS5

This tells the installer to use the older version of the VirtualBox network stack. With it, the Docker Toolbox works like a charm.

2015-09-08

Windows 10 Remote Desktop Via RD Gateway fails

On Windows 10 I had a problem using the remote desktop client to connect to my company Windows machine. It was not possible to get beyond the Remote Desktop Gateway. All I got was: "Your computer can't connect to the remote server because the Remote Desktop Gateway server is temporarily unavailable. Try connecting later or contact your network administrator for assistance."

But today I found a workaround at: https://social.technet.microsoft.com/Forums/windowsserver/en-US/58521677-b54c-4285-9a06-9a966a9d8549/clean-install-windows-10-can-not-rdp-via-2012-rd2-rdg?forum=winserverTS

Long story short:

Add this using regedit:

HKEY_CURRENT_USER\Software\Microsoft\Terminal Server Client
Name: RDGClientTransport
Type: DWORD
Data: 1

Restart all running mstsc.exe instances and everything should work as expected.

2014-09-22

Entity Framework on system databases

Entity Framework supports code-only entity models. The default behavior is to deploy the model to the connected database. If you want to access system databases, this is unacceptable. Fortunately this can be disabled. Unfortunately it's not easy: you have to implement a custom database initialization strategy. The following example adds a check whether the code model is compatible with the current database.

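private class DatabaseInitializer : IDatabaseInitializer<SqlServerCatalog>
{
    // Never creates or changes the database; only verifies that the
    // existing schema matches the code model.
    public void InitializeDatabase(SqlServerCatalog context)
    {
        // false: do not throw if the database holds no model metadata
        if (!context.Database.CompatibleWithModel(false))
            throw new InvalidOperationException("The code model is not compatible with the current database.");
    }
}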

Now you have to add a static constructor to your implementation of the DbContext class:

// The context type must match the type parameter of the initializer above.
static SqlServerCatalog()
{
    System.Data.Entity.Database.SetInitializer<SqlServerCatalog>(new DatabaseInitializer());
}

2014-09-21

Docker easy with fig

Playing with Docker is easy. It is even easier with fig. [http://www.fig.sh/] It's very easy to create a configuration file stating all required dockerized services. The services can easily be linked by name, e.g. if you need a Zookeeper instance for one or multiple Kafka instances. [https://github.com/wurstmeister/kafka-docker/blob/master/fig.yml] The company behind fig is Orchard Laboratories, which provides cloud-based Docker hosts. It was acquired by Docker Inc. in July 2014.

2014-09-20

Active Directory user rename and SQL Server

Renaming users is easy. Spreading the news about it is hard. SQL Server in particular is ignorant of such changes. All security tokens for a user or group are cached for performance reasons. The only way to force an update is to drop the existing cache using: DBCC FREESYSTEMCACHE ('TokenAndPermUserStore')

Format data types on SQL Server 2012+

I was required to format multiple variables and concatenate them into a beautiful string. I stumbled upon the FORMAT function, a built-in function since SQL Server 2012. Look at http://msdn.microsoft.com/en-us/library/hh213505(v=sql.110).aspx

I missed the part that states:
The following table lists the acceptable data types for the value argument together with their .NET Framework mapping equivalent types.

They are serious about this. I tried the bit data type and it failed miserably.
So you still have to do a
 convert([n]varchar(1), @bit_value)  
for the value as number or
 case when @bit_value = 1 then 'TRUE' when @bit_value = 0 then 'FALSE' else 'NULL' end  
for the value as text.