
I am writing a matrix multiplication module in Verilog and I ran into an issue where an expression evaluates to a bunch of 'xxxx':

// multiplies 5x32 matrix by 32x5 matrix
module matmul(input [4959:0] A, input [4959:0] B, output reg [799:0] out);
    integer i,j,k;
    integer start = 0;

    reg [31:0] placeholder_A [4:0][31:0];
    reg [31:0] placeholder_B [31:0][4:0];
    reg [31:0] placeholder_out [4:0][4:0];

  always @(A or B) begin
      // initialize output to zeros
      for (i=0; i<800; i=i+1)
          out[i] = 0;

      // initialize placeholder output to zeros
      for (i=0; i<5; i=i+1)
        for(j=0; j<5; j=j+1)
          placeholder_out[i][j] = 32'd0;

      // turn flat vector A array into matrix
      for (i=0; i<5; i=i+1)
        for(j=0; j<32; j=j+1) begin
          placeholder_A[i][j] = A[start +: 31];
          start = start + 32;
        end
      start = 0;

      // turn flat vector B array into matrix
      for (i=0; i<32; i=i+1)
        for(j=0; j<5; j=j+1) begin
          placeholder_B[i][j] = B[start +: 31];
          start = start + 32;
        end
      start = 0;

      // do the matrix multiplication
      for (i=0; i<5; i=i+1) // A.shape[0]
        for(j=0; j<5; j=j+1) // B.shape[1]
          for(k=0; k<32; k=k+1) // B.shape[0] or A.shape[1]
            placeholder_out[i][j] = placeholder_out[i][j] + (placeholder_A[i][k]*placeholder_B[k][j]); // this is where I am having problems
      start = 0;

      // flatten the output
      for (i=0; i<5; i=i+1)
        for(j=0; j<5; j=j+1) begin
          out[start] = placeholder_out[i][j];
          start = start + 1;
        end
  end
endmodule 

The placeholder_out variable (and therefore the out output) evaluates to 'xx...xxx' and I cannot understand why. When I check the signals through the testbench, both placeholder_A and placeholder_B contain valid values. Any help would be appreciated. You can run the testbench here: https://www.edaplayground.com/x/2P7m

Ach113
  • This means that something involved in evaluating the variable has not been initialized to a proper value. You need to debug the simulation by dumping waveforms, adding print statements, or by other means. It would also be a good idea to provide the code for your testbench. – Serge Mar 04 '20 at 16:21
  • @Serge, testbench should be available in the link – Ach113 Mar 04 '20 at 16:41

1 Answer


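A couple of things that I observed from the code snippet. First of all, the inputs are not wide enough. Each matrix holds 5*32 = 160 elements of 32 bits each, so the required width is 32*32*5 = 5120 bits, and the ports should be declared as input [5119:0] A, input [5119:0] B. With [4959:0], the last five elements fall outside the vector, and out-of-range bits read as X. A linting tool might have caught this issue.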
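
The width arithmetic can be checked in simulation with a few localparams. This is only a sketch: the module name matmul_widths and the parameter names are made up for illustration and do not appear in the question.

```verilog
// Hypothetical helper module: derives the port widths from the matrix
// dimensions instead of hard-coding them.
module matmul_widths;
  localparam ROWS  = 5;   // rows of A, columns of B
  localparam INNER = 32;  // columns of A, rows of B
  localparam W     = 32;  // bits per element

  localparam IN_BITS  = ROWS * INNER * W;  // 5 * 32 * 32 = 5120
  localparam OUT_BITS = ROWS * ROWS  * W;  // 5 * 5  * 32 = 800

  initial begin
    $display("input ports : [%0d:0]", IN_BITS - 1);   // [5119:0]
    $display("output port : [%0d:0]", OUT_BITS - 1);  // [799:0]
  end
endmodule
```

Declaring the ports as [IN_BITS-1:0] and [OUT_BITS-1:0] would keep them consistent if the dimensions ever change.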

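Secondly, start needs to be reset to zero at the beginning of the always block. As written, the final flattening loop leaves start at 25, so every evaluation after the first begins indexing A at bit 25 and eventually reads past the end of the vector. Out-of-range bits read as X, and those X's propagate into the products. Resetting start up front also avoids inferring latches on start.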

  always @(A or B) begin
  //...
    start=0;
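
Putting both points together, the module might look like the sketch below. It is not the asker's final code: the name matmul_fixed is made up, and beyond the two fixes described above it also assumes the part-selects were meant to be 32 bits wide (`+: 31` selects only 31 bits) and that each flattened result should occupy 32 bits of out (`out[start]` writes a single bit).

```verilog
// Sketch only: multiplies a 5x32 matrix by a 32x5 matrix of 32-bit elements.
module matmul_fixed(input [5119:0] A, input [5119:0] B, output reg [799:0] out);
  integer i, j, k;
  integer start;

  reg [31:0] placeholder_A   [4:0][31:0];
  reg [31:0] placeholder_B   [31:0][4:0];
  reg [31:0] placeholder_out [4:0][4:0];

  always @(A or B) begin
    // reset the running bit offset on every evaluation
    start = 0;

    // turn flat vector A into a 5x32 matrix
    for (i = 0; i < 5; i = i + 1)
      for (j = 0; j < 32; j = j + 1) begin
        placeholder_A[i][j] = A[start +: 32];  // assumption: 32-bit slice
        start = start + 32;
      end
    start = 0;

    // turn flat vector B into a 32x5 matrix
    for (i = 0; i < 32; i = i + 1)
      for (j = 0; j < 5; j = j + 1) begin
        placeholder_B[i][j] = B[start +: 32];  // assumption: 32-bit slice
        start = start + 32;
      end
    start = 0;

    // multiply and flatten; every bit of out is written, so no pre-clear
    for (i = 0; i < 5; i = i + 1)
      for (j = 0; j < 5; j = j + 1) begin
        placeholder_out[i][j] = 32'd0;
        for (k = 0; k < 32; k = k + 1)
          placeholder_out[i][j] = placeholder_out[i][j]
                                + placeholder_A[i][k] * placeholder_B[k][j];
        out[start +: 32] = placeholder_out[i][j];  // assumption: 32 bits per result
        start = start + 32;
      end
  end
endmodule
```

Each sum is truncated to 32 bits here, as in the original; a wider accumulator would be needed if overflow matters.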

I'd advise using always_comb (a SystemVerilog construct) instead of a manual sensitivity list, but that is an entirely different topic.
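
If the code has to stay in plain Verilog-2001, the wildcard sensitivity list `always @*` is the closest substitute for always_comb, though the two are not identical (always_comb also runs once at time zero and adds extra checks). A minimal sketch on a hypothetical four-element dot product, dot4, which is not part of the question:

```verilog
// Hypothetical example module: dot product of two vectors of four
// 32-bit elements, packed into 128-bit ports.
module dot4(input [127:0] a, input [127:0] b, output reg [31:0] y);
  integer k;

  // @* lets the tool infer the sensitivity list from the signals
  // that are read inside the block
  always @* begin
    y = 32'd0;
    for (k = 0; k < 4; k = k + 1)
      y = y + a[32*k +: 32] * b[32*k +: 32];
  end
endmodule
```

With an inferred list there is no risk of forgetting a signal when the block is edited later.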

As a side note, the given code snippet will create large combinational hardware, as far as I understand: the fully unrolled loops describe 5*5*32 = 800 32-bit multiplications. You may want to check the synthesis results for timing violations and consider an alternative implementation.

sharvil111