Given a seed string, I would like to extend it using a prefix hash and a read list, as follows:
- Input a seed string $seed
- Extract the last k bases of the seed
- Look up in prefix_hash the reads in read_list whose first k bases are the same as the last k bases of the seed
- Merge those reads onto the end of $seed
- Repeat from step 2 for each merged string until no end can be extended
I'm stuck with my code below:
use strict;
use Data::Dumper;
use Carp;
my $k = 2;
my %readlist = (
    "read1" => "ACTGA",
    "read2" => "ACAAA",
    "read3" => "CTCGC",
    "read4" => "GAGGG",
    "read5" => "TTTCC",
);
my %prefix_hash = (
    # This is a hash of arrays (the prefix hash).
    # Each key is a k-base string, matched against the last k bases of the seed.
    # Each array lists the reads whose first k bases equal that key.
    # In this case k = 2.
    "AC" => ["read1","read2"],
    "AG" => ["read3"],
    "GA" => ["read4"],
    "TT" => ["read5"]
);
my $seed = "AAAAC";
my @newreads = extend_seed($seed);
sub extend_seed {
    my  $str    = shift;
    my @new_str;
    my $first_lastk_str = substr($str,-($k));
    print "$first_lastk_str\n";
    # I'm stuck here how can I recurse and merge
    return @new_str;
}
Given the example above, I want to get the following output:
Initial     AAAAC
First_merge AAAACTGA  # Seed merged with read1
            AAAACAAA  # Seed merged with read2
Last_merge  AAAACTGAGGG # First_merge merged with read4
What's the way to go about it?
First, you need a merge_strings routine:
sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}
The routine assumes that the last $k characters of $x and the first $k characters of $y match.
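Because merge_strings does not verify that assumption itself, a guarded variant can help while debugging. The following is only a sketch: the name merge_strings_checked is hypothetical and not part of the original code, and returning undef on a mismatch is just one possible convention.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original answer):
# merge $y onto $x only if the last $k characters of $x equal the
# first $k characters of $y. Returns undef when they do not match.
sub merge_strings_checked {
    my ($x, $y, $k) = @_;
    return undef if $k < 1;
    return undef if length($x) < $k or length($y) < $k;
    return undef if substr($x, -$k) ne substr($y, 0, $k);
    # Append everything in $y after the shared $k-character overlap.
    return $x . substr($y, $k);
}

my $ok = merge_strings_checked("AAAAC", "ACTGA", 2);
print "$ok\n";                                        # AAAACTGA

my $bad = merge_strings_checked("AAAAC", "GAGGG", 2);
print defined $bad ? "merged\n" : "no overlap\n";     # no overlap
```

When the prefix hash is known to be correct, the unchecked merge_strings is sufficient, since every read looked up through the hash already shares the required prefix.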
use strict; use warnings;
use Data::Dumper;
my $k = 2;
my %readlist = (
    "read1" => "ACTGA",
    "read2" => "ACAAA",
    "read3" => "CTCGC",
    "read4" => "GAGGG",
    "read5" => "TTTCC",
);
my %prefix_hash = (
    "AC" => ["read1","read2"],
    "AG" => ["read3"],
    "GA" => ["read4"],
    "TT" => ["read5"]
);
my $seed = "AAAAC";
my @newreads = extend_seed($seed, $k, \%prefix_hash, \%readlist);
print Dumper \@newreads;
sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}
sub extend_seed {
    my ($x, $k, $prefix, $reads) = @_;
    my $key = substr $x, -$k;
    return unless exists $prefix->{$key};
    my @ret = map merge_strings($x, $_, $k),
                  @{$reads}{@{ $prefix->{$key} }};
    push @ret, map extend_seed($_, $k, $prefix, $reads), @ret;
    return @ret;
}
Output:
$VAR1 = [
          'AAAACTGA',
          'AAAACAAA',
          'AAAACTGAGGG'
        ]; 
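The prefix hash in the example is written by hand. As a possible alternative, it could be derived from the read list so that the two cannot drift apart. The sketch below assumes a hypothetical helper named build_prefix_hash, which is not in the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original post):
# map each k-base prefix to the names of the reads that start with it.
# Reads shorter than $k are skipped. Names are sorted so that the
# order inside each array is deterministic.
sub build_prefix_hash {
    my ($reads, $k) = @_;
    my %prefix;
    for my $name (sort keys %$reads) {
        my $seq = $reads->{$name};
        next if length($seq) < $k;
        push @{ $prefix{ substr($seq, 0, $k) } }, $name;
    }
    return \%prefix;
}

my %readlist = (
    read1 => "ACTGA",
    read2 => "ACAAA",
    read3 => "CTCGC",
    read4 => "GAGGG",
    read5 => "TTTCC",
);

my $prefix_hash = build_prefix_hash(\%readlist, 2);
for my $key (sort keys %$prefix_hash) {
    print "$key => @{ $prefix_hash->{$key} }\n";
}
# AC => read1 read2
# CT => read3
# GA => read4
# TT => read5
```

For the sample data this places read3 under CT, because CTCGC starts with CT, whereas the hand-written hash lists it under AG. Neither key is reached from the seed AAAAC, so the output shown above is unaffected.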
         
                                         
                                         
                                         