Check file presence

Hi,

Sorry for a small question, but I could find a task that I could use to see if a file (.csv) exists.

Also if the file is present, I want to check if the file is for the current date which is first value in the first row.

Thanks,

Aleem

Hi,

I believe you mean do these steps in Automation module right? I think best would be use your CSV file and create connection in TDP. If file won’t exist you won’t be able use it. Then you can do other steps with SQL. Also you can use variables or SQL statement in SQL Editor activity.

Filip

Thank you Filip,

Please pardon my ignorance, as I am just starting to learn about TDP.

By CSV file connection, do you mean “Import CSV” or is it some other activity?

Also the filename is dynamic with current date in it.

Here is the scenario, User download file from partner’s website, give it an appropriate name with current date and place it in a folder for processing.

There are a few validations:

1- Validate that the date within the file is current date

2- All dates are in the required format

3- There are a couple of extra columns that need to be deleted (This I guess can be achieved by not mapping those columns)

4- The file has correct line terminator CR + LF

I am not sure if the Import wizard allows to use a dynamic name. So I was thinking if there is a way to check file presence, I could use that and then copy the file with a specific name into a working folder. The import process can then be kicked off.

Thanks,

Thanks,

Aleem

Hi,

Any ideas folks!!

Thanks,

Aleem

I have built a few ETL processes using TDP. I load csv and excel files into staging tables in two of the scripts. To simplify things, I use the Import Wizard to load the files into the staging tables, setting the truncate table option as I get complete files each time. I then use the RowCount variable from the Import Wizard step in an If Condition step, testing to see if I have any rows to process. If the file was missing, the staging table would be empty due to the truncate, and I would take that branch of the if based on row variable < 1, otherwise take the other branch (testing variable > 0) and process the data in the table. You can have a wildcard in the filename in the Import step, I do that for files I receive that have datetime in the file name (such as Data_File_*.csv).

You’ll have to map columns from the csv to columns of your target table, so you’ll have to map the first column (that has the date in it) to a column that allows null, since I assume your other rows will not have a value for the date. Then you could use SQL to get the value from the one row where date is not null. You could also use SQL to check the data pulled from the staging table against the current date.

As for validating dates are in a required format, they would need to be in some sort of valid date format to import properly into your staging table. If you need to verify that dates are usable before import, you’d need to write some sort of external script to do that and then call it using an Execute Script step, using the return code variable to determine what happened in the external script.

Yep, don’t map columns you do not need.

You’ll also need to use an external script to determine if the file has CR + LF (Windows) or just LF (Unix). You could incorporate the date check with the end of line characters check in a single script.

Thank you!

This is a good suggestion hopefully it will work for me too.

Thanks again

You can also call a batch file to pipe the file folder contents into a text file (DIR “P:\Projects\HEDIS\NDC_LoadFiles” > “P:\Projects\HEDIS\NDC.txt”) this will create a text file NDC.txt with the same format each time. You can create a Folder_Contents table to import the text file into (truncate and load each time). Then you can sql test the daylights out of it. count(*) where file name matches your string pattern will give the number of rows pull that into a variable and test it. Use an if condition step to branch one way when = 0 and another way when > 0. You can also put case statements in your sql to test the date part of the file name or anything else you want to return to an automation variable for later. You can put all this inside a loop that pauses for a specified period of time if the file count matching your sting pattern = 0 and then tries again after x number of minutes and quits after y number of tries. You can then test after exit of the loop to see if you reached max y or else found a file and branch accordingly (like send an email alter for no file found or continue load process). I attached a batch file for piping contents into a text file (It would not let me upload a .bat file so I added a .txt to the end of the name your file would need to end with .bat). You can use the Run Program step in automation to call the batch job.

Have fun you can automate anything,
GetFileList.bat.txt (149 Bytes)