
For a use case we need to run DQ on CSV files loaded into HDFS (only), so no externally managed Hive tables are available. Our use case is much like the desktop tutorial Hadoop (01.0.0 Read whole folder - MAP.plan), but we'd like to create VCIs (Virtual Catalog Items) and update them once a day for profiling, DQE, and sample data.

Two questions:

  • How do we connect to HDFS? (The simple JDBC connection we use in DBeaver doesn't work.)
  • How do we update the VCIs?

Hi @Marnix Wisselaar, thank you for posting! Could you please raise a Support ticket for this? Our team will be in touch as soon as possible.


Can someone just document the process of creating VCIs for HDFS?


Hi @msyed6, thanks for posting. We do not support direct connections to HDFS directories. For VCIs (Virtual Catalog Items), you can find the documentation here, as we can read files that way.

Please let me know if this helps 🙋🏻‍♀️

