A covariance kernel for proteins

Jean-Philippe Vert

doi:10.1109/IJCNN.2004.1380902

Abstract

We propose a new kernel for biological sequences that borrows ideas and techniques from information theory and data compression. The kernel can be used with any kernel method, in particular Support Vector Machines for protein classification. By incorporating prior biological assumptions about the properties of amino-acid sequences and using a Bayesian averaging framework, we compute the value of the kernel in linear time and space, building on earlier results from the field of universal coding. We report encouraging classification results on a standard protein homology detection experiment.
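
To make the Bayesian averaging idea concrete, here is a minimal sketch of a kernel obtained by averaging over a family of probabilistic sequence models, K(x, y) = sum_i w_i P_i(x) P_i(y). Everything specific in it is an assumption made for illustration and is not taken from the abstract: the two-letter toy alphabet, the two i.i.d. models, the uniform prior, and the names log_prob and log_mixture_kernel. The abstract does not state the model family it uses, and the universal coding techniques it relies on for linear-time computation are not reproduced here.

```python
import math

def log_prob(seq, probs):
    """Log-probability of seq under an i.i.d. model with per-letter probabilities."""
    return sum(math.log(probs[ch]) for ch in seq)

def log_mixture_kernel(x, y, models, prior):
    """Return log K(x, y), where K(x, y) = sum_i prior[i] * P_i(x) * P_i(y).

    Works in log space (log-sum-exp) so long sequences do not underflow.
    """
    terms = [
        math.log(w) + log_prob(x, m) + log_prob(y, m)
        for m, w in zip(models, prior)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical toy setup: alphabet {H, P}, two i.i.d. models, uniform prior.
MODELS = [{"H": 0.8, "P": 0.2}, {"H": 0.2, "P": 0.8}]
PRIOR = [0.5, 0.5]

k_same = math.exp(log_mixture_kernel("HHPH", "HHHP", MODELS, PRIOR))
k_diff = math.exp(log_mixture_kernel("HHPH", "PPPH", MODELS, PRIOR))
print(k_same > k_diff)  # sequences with similar composition score higher
```

With a fixed, finite set of models, each kernel evaluation costs time proportional to the combined length of the two sequences. Because K is a weighted sum of products P_i(x) P_i(y) with positive weights, it is symmetric and positive semidefinite, so it could be passed to a kernel method such as an SVM as a precomputed Gram matrix.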
