
Rights protection of trajectory datasets with nearest-neighbor preservation

  • Regular Paper
The VLDB Journal

Abstract

Companies frequently outsource datasets to mining firms, and academic institutions create repositories or share datasets to promote research collaboration. Still, many practitioners have reservations about sharing or outsourcing datasets, primarily because they fear losing the principal rights over the dataset. This work presents a way of convincingly claiming ownership rights over a trajectory dataset without destroying the salient dataset characteristics that are important for accurate search operations and data-mining tasks. The digital watermarking methodology that we present imperceptibly distorts a collection of sequences, effectively embedding a secret key, while retaining as well as possible the neighborhood of each object, which is vital for operations such as similarity search, classification, or clustering. A key contribution of this methodology is a technique for discovering the maximum distortion that still maintains such desirable properties. We demonstrate both analytically and empirically that the proposed dataset marking techniques can withstand a number of attacks (such as translation, rotation, noise addition, etc.) and therefore can provide a robust framework for facilitating the secure dissemination of trajectory datasets.
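The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a simple additive rule on Fourier magnitudes, \(\widehat{\rho_j} = \rho_j + p\,\mu_j(\mathcal{D})\,W_j\), with a watermark \(W \in \{-1, 0, +1\}^n\) derived from a secret key, and a non-blind detector that compares mean magnitudes before and after marking. The function names (`make_watermark`, `embed`, `correlation`, `nearest_neighbors`), the embedding rule, and the detector are all assumptions made for illustration.

```python
import numpy as np


def make_watermark(n, l, seed):
    """Hypothetical key-to-watermark map: l entries in {-1, +1}, the rest 0.

    Position 0 (the DC term) is left unmarked; this is a choice of the sketch.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(n)
    positions = rng.choice(np.arange(1, n), size=l, replace=False)
    w[positions] = rng.choice([-1.0, 1.0], size=l)
    return w


def embed(dataset, w, p):
    """Embed w with power p into the Fourier magnitudes of every trajectory.

    dataset: complex array of shape (m, n); each row is a 2-D trajectory
    encoded as x + iy. Assumed rule: rho_hat_j = rho_j + p * mu_j(D) * w_j,
    with the phases left untouched.
    """
    X = np.fft.fft(dataset, axis=1)
    rho, phi = np.abs(X), np.angle(X)
    mu = rho.mean(axis=0)                        # mu_j(D)
    rho_hat = np.maximum(rho + p * mu * w, 0.0)  # keep magnitudes non-negative
    return np.fft.ifft(rho_hat * np.exp(1j * phi), axis=1)


def correlation(original, suspect, w):
    """Assumed non-blind detector: mean relative shift of mu_j along w."""
    mu = np.abs(np.fft.fft(original, axis=1)).mean(axis=0)
    mu_s = np.abs(np.fft.fft(suspect, axis=1)).mean(axis=0)
    marked = w != 0
    return float(np.mean(w[marked] * (mu_s[marked] - mu[marked]) / mu[marked]))


def nearest_neighbors(dataset):
    """Index of the Euclidean nearest neighbor of every trajectory."""
    diff = dataset[:, None, :] - dataset[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)               # a trajectory is not its own neighbor
    return dist.argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    steps = rng.normal(size=(50, 64)) + 1j * rng.normal(size=(50, 64))
    D = np.cumsum(steps, axis=1)                 # synthetic random-walk trajectories
    w = make_watermark(n=64, l=16, seed=42)
    D_hat = embed(D, w, p=0.05)
    print("correlation, right key:", correlation(D, D_hat, w))
    print("correlation, wrong key:", correlation(D, D_hat, make_watermark(64, 16, 7)))
    kept = np.mean(nearest_neighbors(D) == nearest_neighbors(D_hat))
    print("fraction of nearest neighbors preserved:", kept)
```

Under these assumptions the correlation for the correct key should land near the embedding power p, while an unrelated key yields a smaller value. Raising p makes detection easier but changes more nearest neighbors, which is the trade-off behind the search for a maximum tolerable distortion.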

Abbreviations

\({\mathcal{D}}\): Original dataset of trajectories

\({\widehat{\mathcal{D}}}\): Watermarked dataset

\(x\): Trajectory in time-domain

\(X\): Trajectory in frequency domain

\(n\): Number of points in a sequence

\({X_j = \rho_j e^{\phi_j i}}\): Fourier descriptor as a function of its magnitude and phase

\(p\): Embedding power

\({\widehat{X_j} = \widehat{\rho_j}e^{\widehat{\phi_j}i}}\): Watermarked Fourier descriptor as a function of its watermarked magnitude and phase

\({\mu_j(\mathcal{D})}\): Mean of \(\rho_j\) across the trajectories in \({\mathcal{D}}\)

\(l\): Number of non-zero elements of watermark

\(\chi\): Correlation

\({\widehat{D}_p(x,y)}\): Distance between two trajectories \(x\), \(y\) after watermarking with power \(p\)


Author information

Corresponding author

Correspondence to Claudio Lucchese.

Additional information

This work is partially supported by the National Science Foundation under Grant No. IIS-0914934.

About this article

Cite this article

Lucchese, C., Vlachos, M., Rajan, D. et al. Rights protection of trajectory datasets with nearest-neighbor preservation. The VLDB Journal 19, 531–556 (2010). https://doi.org/10.1007/s00778-010-0178-6

  • DOI: https://doi.org/10.1007/s00778-010-0178-6
