Thursday, September 01, 2011

Bulk Insert: An unexpected end of file was encountered in the data file

I recently encountered a error in a BULK INSERT operation. The BULK INSERT was using a non-xml format file to skip some columns in the csv file and also to map the fields in the csv file with the columns in the table. Of course, I was importing into a temp table that servers as a staging environment. And wwhile doing so I got the foll. error:

Msg 4832, Level 16, State 1, Line 2
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 2
The OLE DB provider "BULK" for linked server "(null)" reported an error.


On further investigation, I figured out that the issue was with the "Number of columns" in the format file. My csv file had 91 columns. The temp table had 104 columns. The format file was used to skip 15 columns. Initially I had "Number of Columns" at 46 (the value was hard coded from earlier days). This caused the "An unexpected end of file" error.

default non-XML format file for myTestSkipCol



While researching this error, this is what I found:
  • When "Number of columns" is less than the actual columns mapped, the error is:
Msg 4832, Level 16, State 1, Line 2Bulk load: An unexpected end of file was encountered in the data file.Msg 7399, Level 16, State 1, Line 2The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.Msg 7330, Level 16, State 2, Line 2Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
  • When "Number of columns" is more than the total number of columns in table or in format file, the error is:
Msg 4823, Level 16, State 1, Line 2Cannot bulk load. Invalid column number in the format file
  • When "Number of columns" is any number between the number of mapped columns and the total number of columns in the table, the BULK INSERT worked fine.
Apart from the above mentioned cause, there are many other causes for the "unexpected end of line" error. Some of these are:

  • In the data file, one or more rows have less columns that the expected number of columns.
  • Wrong field terminator mentioned in format file or the BULK INSERT command
  • Wrong row terminator mentioned in format file or the BULK INSERT command




Thursday, August 25, 2011

Error trying to REORGANIZE/ REBUILD clustered index on CDC Enabled table

Recently I was trying to REORGANIZE a Change Data Capture enabled table when I received the following error:


Msg 22983, Level 16, State 1, Procedure sp_cdc_ddl_event_internal, Line 77
The unique index 'MyTable_PK' on source table 'MyTable' is used by Change Data Capture. To alter or drop the index, you must first disable Change Data Capture on the table.
Msg 3609, Level 16, State 2, Line 2
The transaction ended in the trigger. The batch has been aborted.

This table is very huge - more than 500 million rows. So the reorganize took a very long time - almost 19 hours. I was constantly monitoring the progress of the operation. And just when it was about 98% complete, I got this error. The frustrating part was that I got this error after 19 hours of running this command.


This was on SQL Server 2008 SP1 RTM. The error itself is a bug that was fixed in CU6 for SP1. So we either had to update to CU6 or look for a workaround. The workaround was to either use DBCC REINDEX or to use ALTER INDEX ALL on MyTable REORGANIZE

Obviously updating a Production SQL machine had to be planned for and was not going to happen immediately. And DBCC DBREINDEX would take the table offline - so that was also a no go. So the only option left was ALTER INDEX ALL. This was such a waste of resources as I had recently reorganized all the other indexes on the table. But since it was the only feasible option, I had to use it.

NB: The error does not appear when reorganizing a non-clustered index. Or may it DOES appear when reorganizing a non-clustered index on the unique key that is defined in CDC.

Bug Reference: http://support.microsoft.com/kb/977076/

Update: After the reorganize errored out, I checked the fragmentation on the index and found that REORGANIZE did do its magic. Thats logical since REORGANIZE does not roll back the work it has already done. And as I have mentioned the error happened at the end of reorganize. So most of the reorganizing was already complete by then.


Tuesday, August 16, 2011

Faster backups, restores & data file growths in SQL Server

I recently attended SQLSkills.com Performance Tuning SQL Server. One of the things we learnt about was configuration options that can affect performance. One such option is "Instant File Initialization". Instant File Initialization was introduced in SQL Server 2005. This feature allows file allocation requests to skip zero initialization. By default when new files are created for a certain size (say 1GB), SQL server will write zeros to the entire 1GB file and thereby "remove" any data still persistent on that 1GB space. This can take a huge amount of time when the file sizes are big. By enabling "Instant File Initialization", SQL Server skips the process of writing zeros in this space.

Check if this is enabled: Just one of the ways is to enable trace flag 3004 & 3605 and then create a new database. Once done, look at the SQL server log files:


Run following in SQL Mgmt Studio:
DBCC TRACEON (3004 );
DBCC TRACEON (3605 );


CREATE DATABASE testDB_IFI
ON
( NAME = testDB_IFI_dat,
    FILENAME = 'F:\data\testDB_IFI.mdf',
    SIZE = 10GB,
    MAXSIZE = 50GB,
    FILEGROWTH = 5GB )
LOG ON
( NAME = testDB_IFI_log,
    FILENAME = 'F:\data\testDB_IFI.ldf',
    SIZE = 5MB,
    MAXSIZE = 25MB,
    FILEGROWTH = 5MB )

DBCC TRACEOFF (3004 );
DBCC TRACEOFF (3605 );


Now look at the SQL Server log file. If you see foll. entries it would mean Instant File Initialization is NOT enabled:
Zeroing F:\data\testDB_IFI.mdf from page 0 to 1310720 (0x0 to 0x280000000)
Zeroing completed on F:\data\testDB_IFI.mdf

Conversely, if you DONT see these entries it means Instant File Initialization is enabled.

NB: Log files CANNOT be instant initialized. So you WILL see foll. entries irrespective of whether you have this enabled or disabled:
Zeroing F:\data\testDB_IFI.ldf from page 0 to 640 (0x0 to 0x500000)
Zeroing completed on F:\data\testDB_IFI.ldf

Enabling Instant File Initialization:
This is essentially a Windows feature and you can enable this feature in SQL Server by giving the user account under which SQL Server is running permission to access this Windows feature. The privilege to give is "Perform Volume Maintenance Tasks".

Resources:
Paul Randall's blog: How to tell if you have instant file initialization enabled and Follow-up post.
Kimberly Tripp's blog: "Instant Initialization: What, Why and How?"
Paul's Misconceptions around file initialization post



Friday, July 29, 2011

SQL Server Error: The partner transaction manager has disabled its support for remote/network transactions

I was getting foll. error when remotely executing a stored procedure over SQL Server Linked Server.


OLE DB provider "SQLNCLI10" for linked server "LINKEDSERVERNAME" returned message "The partner transaction manager has disabled its support for remote/network transactions.".
Msg 7391, Level 16, State 2
The operation could not be performed because OLE DB provider "SQLNCLI10" for linked server "LINKEDSERVERNAME" was unable to begin a distributed transaction.

After much research I found the solutions using these links:
http://www.sqlwebpedia.com/content/msdtc-troubleshooting

http://social.msdn.microsoft.com/forums/en-US/adodotnetdataproviders/thread/7172223f-acbe-4472-8cdf-feec80fd2e64/

Much thanks to Chuck Lathrope and Eric for these!

Create a linked server to itself in MS SQL Server 2008

On my testing environment we have transaction and reporting databases on same server. But in live environment, they are on different servers and connect to each other using Linked Servers feature. So to reproduce this environment on test server, I needed to create a Linked Server on test "transaction" DB server that would "Link" to the "reporting" DB server on the same physical DB server.

This proved impossible using Management Studio; but I was able to successfully do this using t-sql code:

EXEC sp_addlinkedserver @server = N'reportingserver',
    @srvproduct = N' ',
    @provider = N'SQLNCLI',
    @datasrc = N'SQL_Server_Instance',
    @catalog = N'DB_name'

Hope this helps someone.

Tuesday, June 28, 2011

INVALID_FIELD and INVALID_FIELD_FOR_INSERT_UPDATE error in Salesforce.com Professional Edition

Some time back I wrote a post on the INVALID_FIELD error that you may get when calling the Salesforce.com upsert() method. Today I want to elaborate on this issue a little more since Salesforce.com has changed a lot since I last wrote about this issue.

The INVALID_FIELD error or the  INVALID_FIELD_FOR_INSERT_UPDATE error is commonly seen in the Professional Edition of Salesforce.com when using the upsert() method. In professional edition, field security is defined by what fields are included on the page layout. If the fields are not a part of the page layout you cannot use those fields in the upsert() method. If they are not in the page layout, the API (or any apex code for that matter), would not see the field and any reference to it would result in error.

So to overcome this error, all you have to do is add these fields to the custom object's page layout. To do this:


  1. Select Your Name | Setup | Create | Objects.
  2. Choose the object.
  3. Scroll to the Page Layouts section.
  4. Click Edit next to the page layout you want to customize.
  5. Add a new "Section" to the page layout by dragging "section" to the page.
  6. Add fields that are throwing an error into this section.
Now you should be able to call upsert() on these fields without error.  Doing the above will also display these fields in the actual page layout for all users. If you don't want to do this, you can edit the page layouts of individual objects that use this layout to hide these columns.
 



Thursday, May 19, 2011

First Timer's cf.Objective() 2011 - Part 1

I attended the cfObjective() conference for the first time. The conference was at Minneapolis, MN from May 12-14. The 3 day conference was great with a lot of sessions across different topics.
Here are some numbers from the conferene:
  • Days: 3
  • Sessions: 75
  • Speakers: 55
  • Atendees: ~315
  • Sessions at a time: 4-5
With so many sessions and only so much  time, there was a lot I missed; at the same time there was a lot that I learned. Many of the presenters have made their presentations available. So that will be good to check out the sessions that I missed. If you were not able to make it to the conference and would like to check out what you missed, head over to http://www.cfobjective.com/index.cfm/sessions/

You can see the sessions list and if you click over to a session's details, you will be able to see the link to the presentations/ slides/ codes from the session.

That said, here are a few things I learned and that I plan on using. I have clubbed these into categories for easy reference.

ColdFusion related tools:
Other Productivity Tools: These are manily from Jim Priest's talk on Developer Toolbox:
Jim also has many other alternatives to the above tools. For the full presentation, a pdf hand-out (with a list of other alternatives) and some code (mainly config for AutoHotKey, etc), download his presentation package from: http://thecrumb.com/download/cfo-2011-toolbox.zip

More learnings coming in Part 2 of the post.

Thursday, April 21, 2011

Debugging ColdFusion Webservices

Debugging webservice calls in ColdFusion is almost always a painful experience. Recently I spent some time debugging a CF call to a .Net webservice that was expecting a ArrayOfInt as well as some custom data types. Here are some steps I went throuogh:


  • The first step is to find out if the call is getting to the webservice or is it being stopped by CF itself. You can use a network sniffing tool (like Wireshark) for this. Or you can enable request/response logging in ColdFusion. Here's how: Open client-config.wsdd (its usually at  C:\ColdFusion8\wwwroot\WEB-INF). You will see a requestFlow and a responseFlow section. Uncomment out this section. Restart CF. Now when you make a webservice call from CF, the raw request/response will be logged in cfusion-out.log (usually located at C:\ColdFusion8\runtime\logs). See if your call generates an entry in this file. If it does, you know that your call is getting through to the webservice endpoint. If you dont see any entry here, that would imply that the request is being killed by CF itself.

    NB: Do remember to comment out the requestFlow and responseFlow lines once you are done debugging.
  • If your request is not getting to the API webservice, chances are you are getting one of 2 errors:
Web service operation methodName with parameters ....... cannot be found. OR
java.lang.IllegalArgumentException: argument type mismatch
  • If you are getting "Web service operation sendMessageToGroup with parameters ....... cannot be found.", then you probably have a mismatch of the arguments you are passing to the service and the arguments that it is expecting. Depending on how you are calling the webservice (CFINVOKE, cfobject, named parameters versus non-named parameters, etc), ensure that the signatures match exactly. If using named parameters, ensure spellings are correct. If using non-named parameters, ensure the order of parameters is correct.
  • If you are instead getting "java.lang.IllegalArgumentException: argument type mismatch", there is probably a mismatch between the argument type that the webservice is expecting and the argument type you are passing to it. This is mostly the case when complex data types are involved. To troubleshoot this problem, here are a few useful steps:
  1. If you have a number of complex data types in the method arguments, your first step is to find out which argument is causing the mismatch. ColdFusion itself will not tell you which argument is causing the "mismatch". This is how I do it: Browse to the web service wsdl file in a browser. Save the file locally to your dev. environment (a place that is web accessible and where you can edit it). This is now your local copy of the wsdl. Open the wsdl file and locate the method thats causing you grief. If you have more than one complex data type (including arrays, custom data type, enums), remove the one you think is causing the problem; now in your CF code to call the service change the endpoint (wsdl location to your local wsdl) as well as adjust the method call accordingly.
    Run the CF page to see if the error changes. If it does you now know that this is the parameter that is causing the "argument type mismatch" error. If the error is still the same you know that the parameter you removed is not the one that is causing the error.
  2. Once you have identified the argument that is causing the error, next task is to find out what exactly is wrong. The best way is to look at the webservice stubs that Java is creating. There are 2 ways of doing this:
    a. Using WSDL2Java: This method has been documented fairly well. See Tom Jordahl' post or Adobe KB article.
    b. There's a simpler way. When you call a web service for the first time, ColdFusion automatically creates the stub files. These are usually located at: C:\ColdFusion8\stubs\
    Depending on how many webservice calls you have made there may be multiple folders here. You can browse thro' the folders to see which one belongs to the service you are troubleshooting. Or you can simply delete all folders, make your call and see which folder was created. Now dig thro' the folder and you will find a number of java class files. Each class file corresponds to something in your wsdl. Now if you try to open the class file, you will see garbage because class files are compiled java file. So to get the source behind the class file, you can use a java decompiler. There are quote a few out there. But I like one one here: http://java.decompiler.free.fr/ It has a GUI version as well as an Eclipse plugin. Once you open a class file you will be able to see the interface definitions as Java internally sees it. 
  • If you are getting a "java.lang.IllegalArgumentException: argument type mismatch" while calling a .Net webservice with ArrayOfInt or ArrayOfString as the argument type, then most probably you are not defining the datatypes correctly. The ArrayOfInt and ArrayOfString will lead you to believe that .Net is expecting an Array. But remember ColdFusion arrays are not the same as .Net arrays. If you look at the wsdl carefully, you will notice that the "ArrayOfInt" is defined as a complexType name.
    <s:complexType name="ArrayOfInt">
    <s:sequence>
    <s:element minOccurs="0" maxOccurs="unbounded" name="int" type="s:int" />
    </s:sequence>
    </s:complexType>

    And complex types in CF are mapped to Structures. Further, you will notice that this complex type has an element with name="int". Now when java looks at the wsdl and creates the stub class files, its renaming this element to "_int". My guess is that its doing this  because "int" is a reserved word in Java and also a native data type. So what this means is to successfully call the .Net webservice, you need:

    <cfset objGroupIds = StructNew()>
    <cfset objGroupIds._int = ListToArray("627303")>
    <cfset callResult = myObj.getUser(objGroupIds)>
Another great resource for understanding/ debugging ColdFusion webservices:
http://tjordahl.blogspot.com/2008/04/reprint-consuming-web-service-complex.html


Monday, March 21, 2011

Detecting Character Encoding of a Text File in ColdFusion

Recently I needed to detect the character encoding of text files (these file were uploaded by end-users) using ColdFusion. Since character encodings cannot be detected (the character encoding is defined by the creator of the file) it boiled down to guesstimating the char encoding. Did some research and found a great Java object called "juniversalchardet" - this is based on the encoding detection library of Mozilla. You can download the library from http://code.google.com/p/juniversalchardet/

ColdFusion code:
<cfscript> detector = createObject( "java", "org.mozilla.universalchardet.UniversalDetector").init(JavaCast("null", "")); myfile = FileOpen("C:\inetpub\wwwroot\MMBranches\EPNU\Web\TestStuff\doc_unicode4.txt" ,"readBinary");
while (! FileIsEOF(myfile) && !detector.isDone()) { // continue the loop if the end of file has not reached x = FileRead(myfile, 1); // read 1 kb binary data detector.handleData(x, 0, 1); // process this binary data.. } detector.dataEnd(); WriteOutput(detector.getDetectedCharset()); </cfscript>

  


Tuesday, March 15, 2011

IIS: Multiple SSL sites on single IP address

I recently had to set up multiple sites on IIS with SSL and using just a single IP address.  This is pretty straight forward with non-SSL sites - you just specify a different host header for each of the sites. But with SSL, this wont work by default. You have 2 options:
1. specify a different port number for SSL for each of the sits. The disadvantage is that your users will have to access the sites as: https://subdomain.daksatech.com:445/ (if SSL was set up on 445)
2. The other option is to run a script that comes with IIS to enable SSL to work with host-headers. Here's how:
  • Set up wildcard domain on server (IIS)
  • Add certificate to site.
  • Add SSL site on port 443.
  • Run foll. script in command prompt:
    cscript.exe adsutil.vbs set /w3svc//SecureBindings ":443:"
    Eg: cscript adsutil.vbs set /w3svc/844934796/SecureBindings ":443:blogger.daksatech.com"

How to get siteid: In IIS, go to site properties, edit log location, this should show the site id in the LogFile name path. Eg: if the log path is W3SVC234234/exyymmdd.log; here "234234" is the siteid.

Location of adsutil.vbs:

%SystemRoot%\system32\inetsrv\adminsamples

or 
%SystemRoot%\inetpub\adminscripts\



Sunday, February 27, 2011

Engineering Management at Facebook

Some ideas on managing engineering teams and scaling up engineering operations by Yishan Wong of Facebook. Five points from the post:


  • Hiring is THE most important piece 
  • Process should be implemented by those who practice it 
  • Promote from within 
  • Make tools your top priority - they increase productivity
  • All managers should be technically proficient 


Filtering Out Bad Developers

How do you filter out the poor developers in an interview. Imran suggests a simple "FizzBuzz" logic question.
Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
Another simple question (from the comments in the post):
Show me (on paper) how you would swap the values of two variables.

Head over here for the full post. Also, check out the great comments on the post.

Distributed Database/ NoSQL Technologies

I have started looking into Distributed Databases to address some of the Big Data problems that I am working on. As part of this I have been reading & researching a few options like MongoDB and CouchDB. One great resource that I came across was a document titled "Survey of Distributed Databases" by Randy Guch.

This is a great starting point for anyone new to the "NoSQL movement". It gives an overview of all the different technologies that come under the NoSQL movement. And then goes on to cover some of the projects under each of these technologies. Overall a highly recommended read.